Final Exam Bus 320 Spring 2000 Russell


Name:

Do not turn over this page until you are told to do so. You will have 3 hours to complete this exam. The exam has a total of 100 points and is divided into three parts. The true and false questions are worth 10 points, the multiple choice questions are worth 30 points, and the long answer questions are worth 60 points. You can use two sides of an 8.5x11 sheet of notes during the exam. Please write clearly and provide answers in the space provided. If you need additional space, use the back of the exam pages and clearly organize your work. A table for the cumulative distribution function (cdf) of the t-distribution is attached on the last page.

Students in my class are required to adhere to the standards of conduct in the GSB Honor Code and the GSB Standards of Scholarship. The GSB Honor Code also requires students to sign the following GSB Honor pledge: "I pledge my honor that I have not violated the Honor Code during this examination. I also understand that discussing the content of this exam with anyone prior to completion by all students would be a violation of the honor code."

Please sign here to acknowledge

I. True or False

Clearly indicate the best answer by circling T or F, indicating that the statement is true or false respectively. If neither T nor F is clearly indicated, the problem will be marked as incorrect. Each problem is worth 1 point.

1. T F Shaq, a player for the Los Angeles Lakers basketball team, successfully shoots 47% of his free throw shots. Last week he made 9 free throws in 9 attempts. Given that he is a 47% free throw shooter and that he attempts 9 free throws, the probability of this happening in that game was about 1 in 893.

2. T F If X is uniformly distributed such that $f(x) = 2$ if $a \le x \le 2$ and $f(x) = 0$ otherwise, then $a = 1.5$.

3. T F If the $X_i$ are iid with a mean of $\mu$ and a variance of $\sigma_x^2$ and are independent, then $\sum_{i=1}^{45} X_i$ is approximately normally distributed with a mean of $45\mu$ and a variance of $45\sigma_x^2$.

4. T F For random samples, the sample average is an unbiased and consistent estimator for the population mean.

5. T F If $X_1, \dots, X_n$ are independent and identically distributed (iid) Bernoulli random variables and $Y = \sum_{i=1}^{n} X_i$, then $E(Y) = np$.

6. T F If returns for an asset are normally distributed with mean .08 and variance .064, then the probability of a return less than .08 is .25.

7. T F In the multiple regression of Y on X1, X2, X3, the R-squared is the squared correlation between Y and the fitted values $\hat{Y}$.

8. T F In a multiple regression the residuals are uncorrelated with the fitted values $\hat{Y}$.

9. T F If X and Y are independent then the correlation between X and Y is zero.

10. T F The regression parameter estimates are the parameter values that maximize the R-squared.
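The arithmetic behind item 1 can be sketched in a few lines. This is only an illustration, and it assumes the nine attempts are independent with the same 47% success rate; the variable names are made up for the example.

```python
# Probability of making all 9 of 9 free throws.
# Assumption: attempts are independent, each with success probability 0.47.
p_make = 0.47    # success probability per attempt (from item 1)
attempts = 9     # number of free throws attempted

p_all = p_make ** attempts    # binomial P(9 successes in 9 trials)
one_in = 1 / p_all            # the same probability written as "1 in N"

print(f"P(9 of 9) = {p_all:.6f}, about 1 in {one_in:.1f}")
```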

II. Multiple choice

Clearly circle the answer that is best. Each problem is worth 3 points for a total of 30. No partial credit will be given in this section. If no answer is clearly circled, the problem will be marked as incorrect.

For questions 1 through 4, consider the output below from a regression in Minitab of the sales price of 150 homes on the square footage and the number of bedrooms. Some of the details of the output have been deleted. In this section, all answers have been rounded to the nearest 100th.

Regression Analysis

The regression equation is
Price = 49.1 + 4.81 Square_Footage - 0.18 Number_Bedrooms

Predictor     Coef   StDev     T      P
Constant    49.111   6.749  7.28  0.000
Square_F    4.8121  0.5556  8.66  0.000
Number_B    -0.178   1.491        0.905

Analysis of Variance
Source           DF       SS      MS       F      P
Regression        2  18559.6  9279.8  102.06  0.000
Residual Error  147  13365.6    90.9
Total           149  31925.2

1. The R-squared of the regression is
a. .12
b. .58
c. .42
d. .72
e. Not enough information is given.

2. The test statistic for the null hypothesis that the coefficient on the number of bedrooms (Number_B) is equal to zero is
a. -.12
b. .905
c. -.178
d. -2.0
e. Not enough information is given.

3. The estimate of the variance of the error term ε is
a. 6.75
b. 102.06
c. 1.1
d. 90.9
e. Not enough information is given.

4. Let $\beta_1$ and $\beta_2$ be the slope coefficients in the regression. You would ______ the null hypothesis that $\beta_1 = \beta_2 = 0$ at the ______ level.
a. Reject, 5% level
b. Fail to reject, 10% level
c. Reject, 1% level
d. Both a and c.
e. None of the above.

5. Let X have the following density function:

$f(x) = .25$ for $0 \le x \le 1$
$f(x) = .50$ for $1 < x < 2.5$
$f(x) = 0$ elsewhere

Find the probability that X < 1.5 (i.e. F(1.5)).
a. .375
b. .75
c. .25
d. .50
e. None of the above.

6. A developer is considering purchasing 2 properties. The return on his investment is uncertain, but he knows that profits are independent across the two properties and each profit is normally distributed. The expected profit from the first investment is 10,000 with a variance of $2000^2$. The second has an expected profit of 15,000 with a variance of $3500^2$. What is the distribution of total profits on the two projects?
a. N(25,000, $4031^2$)
b. N(25,000, $5500^2$)
c. We would need to know the covariance between the profits.
d. No longer a Normal distribution.
e. None of the above.

7. If you randomly guessed the answers to each of the questions in this multiple choice section (including this one), the mean and variance of the total number of correct answers is:
a. mean=2, var=.16
b. mean=.2, var=.16
c. mean=2, var=1.6
d. mean=5, var=2.5
e. None of the above.

8. Smaller confidence intervals result for
a. Larger samples.
b. Smaller variances of the population we are sampling from.
c. Larger confidence levels.
d. a and b
e. all of the above.
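As a rough illustration of how the quantities in the Minitab output for questions 1 through 4 relate to one another, the sketch below recomputes them from the Analysis of Variance table and the Number_B row. The variable names are illustrative; the numbers are copied from the output.

```python
# Values copied from the Analysis of Variance table in the Minitab output.
ss_regression = 18559.6
ss_residual = 13365.6
ss_total = 31925.2
df_regression = 2
df_residual = 147

# R-squared: share of the total sum of squares explained by the regression.
r_squared = ss_regression / ss_total

# Mean squares and the overall F statistic for H0: all slopes equal zero.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual    # estimate of the error variance
f_stat = ms_regression / ms_residual

# t statistic for one coefficient: estimate divided by its standard error.
coef_bedrooms = -0.178
se_bedrooms = 1.491
t_bedrooms = coef_bedrooms / se_bedrooms

print(round(r_squared, 2), round(ms_residual, 1), round(f_stat, 2), round(t_bedrooms, 2))
```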

9. In 1997 the probability that a new car shopper visits the Honda showroom is 23%. Toyota knows that the probability that a Toyota is sold given that the consumer visited a Honda dealer is 18.5%. The probability that a Toyota is sold given that the consumer did not visit a Honda dealer is 21.1%. What is the probability that the consumer visited the Honda dealer given that the consumer purchased a Toyota?
a. .2618
b. .2300
c. .2075
d. Not enough information provided to answer.
e. None of the above.

10. Which of the following is the most correct statement about the Central Limit Theorem (CLT)?
a. The CLT states that for large, random samples, the sample mean $\bar{X}$ is always equal to $\mu$.
b. The CLT states that for random samples the sample mean $\bar{X}$ is always equal to $\mu$.
c. The CLT states that for large random samples the distribution of the population mean is approximately normally distributed.
d. The CLT states that the sum of N iid random variables tends to a Normal distribution as N gets large.
e. Both (a) and (d) are correct.
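Question 9 is an application of Bayes' rule. A minimal sketch of the calculation, using the three probabilities stated in the question (variable names are illustrative):

```python
# Probabilities stated in question 9.
p_honda = 0.23                     # P(visited Honda)
p_toyota_given_honda = 0.185       # P(bought Toyota | visited Honda)
p_toyota_given_no_honda = 0.211    # P(bought Toyota | did not visit Honda)

# Law of total probability: P(bought Toyota).
p_toyota = (p_toyota_given_honda * p_honda
            + p_toyota_given_no_honda * (1 - p_honda))

# Bayes' rule: P(visited Honda | bought Toyota).
p_honda_given_toyota = p_toyota_given_honda * p_honda / p_toyota

print(f"P(Toyota) = {p_toyota:.5f}")
print(f"P(Honda | Toyota) = {p_honda_given_toyota:.4f}")
```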

III. Long answer questions

Try to do work in the space provided under each question, and be sure to show all work in order to facilitate partial credit. Also be sure to place the final answer in the space provided, as indicated by the underline when present.

1. The following is summary information about returns of IBM and Exxon.

Descriptive Statistics
Variable     N     Mean   Median   TrMean    StDev  SE Mean
IBM       1011  0.00091  0.00000  0.00078  0.01573  0.00049
Exxon     1011  0.00182  0.00133  0.00150  0.02248  0.00071

Variable   Minimum  Maximum        Q1       Q3
IBM       -0.07385  0.05927  -0.00972  0.01098
Exxon     -0.14953  0.13164  -0.01096  0.01300

Covariances
              IBM       Exxon
IBM    0.00024729
Exxon  0.00006809  0.00050517

Consider the following portfolio:

$P_1 = c_1 r_f + c_2\,\mathrm{IBM} + c_3\,\mathrm{Exxon}$

where $r_f$ is the risk free rate of return with mean 0.0002 and variance zero.

a. Find the mean and variance of the portfolio with weights $c_1 = .2$, $c_2 = .3$ and $c_3 = .5$.

b. Find the mean and variance of the portfolio with weights $c_1 = -.2$, $c_2 = .5$, $c_3 = .7$. Notice that $c_1$ is negative here. This corresponds to borrowing money at the risk free rate and investing that borrowed money in the stocks. Notice also that the weights still sum to one.
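The mean and variance of a linear combination such as the portfolio in question 1 follow from the weights, the asset means, and the covariance matrix. The sketch below is one way to organize that calculation; `portfolio_moments` is a hypothetical helper written for this example, and the risk-free asset is given a zero row and column in the covariance matrix because its variance is stated to be zero.

```python
def portfolio_moments(weights, means, cov):
    """Mean and variance of sum_i w_i * R_i, given means and covariance matrix."""
    mean = sum(w * m for w, m in zip(weights, means))
    n = len(weights)
    var = sum(weights[i] * weights[j] * cov[i][j]
              for i in range(n) for j in range(n))
    return mean, var

# Asset order: risk-free, IBM, Exxon (values copied from the summary tables).
means = [0.0002, 0.00091, 0.00182]
cov = [
    [0.0, 0.0,        0.0],
    [0.0, 0.00024729, 0.00006809],
    [0.0, 0.00006809, 0.00050517],
]

mean_a, var_a = portfolio_moments([0.2, 0.3, 0.5], means, cov)    # part a
mean_b, var_b = portfolio_moments([-0.2, 0.5, 0.7], means, cov)   # part b

print(mean_a, var_a)
print(mean_b, var_b)
```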

2. (18 points) A Gallup poll taken in May of this year asked individuals "If you had to choose, which of the following issues is likely to be more important in determining your vote for president this year -- where the candidates stand on abortion or where the candidates stand on gun control?" (http://www.gallup.com/poll/indicators/indguns.asp)

         Gun control  Abortion  No opinion
Male             .12       .27         .11
Female           .15       .25         .10

a. What is the probability that an individual is a male and finds Gun control to be the more important issue?

b. What is the probability that an individual thinks gun control is the more important issue?

c. What is the conditional distribution of the more important topic given that the individual is Male (find Pr(topic | male) for each topic)?

d. Is the sex of the respondent independent of their views on the more important topic? Be sure to formally state why or why not using a definition of independence.

e. Interpret the results of part d in plain English. A sentence or two should be sufficient.
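The marginal, conditional, and independence calculations in question 2 all start from the joint table. A small sketch of the mechanics, treating the table entries as joint probabilities (the dictionary layout and names are illustrative):

```python
# Joint probabilities P(sex, topic) copied from the Gallup table.
joint = {
    "Male":   {"Gun control": 0.12, "Abortion": 0.27, "No opinion": 0.11},
    "Female": {"Gun control": 0.15, "Abortion": 0.25, "No opinion": 0.10},
}
topics = list(joint["Male"])

# Marginals: sum the joint table over the other variable.
p_sex = {sex: sum(row.values()) for sex, row in joint.items()}
p_topic = {t: sum(joint[sex][t] for sex in joint) for t in topics}

# Conditional distribution of topic given Male: joint divided by P(Male).
cond_given_male = {t: joint["Male"][t] / p_sex["Male"] for t in topics}

# Independence requires P(sex, topic) = P(sex) * P(topic) in every cell.
independent = all(
    abs(joint[sex][t] - p_sex[sex] * p_topic[t]) < 1e-9
    for sex in joint for t in topics
)

print(p_topic, cond_given_male, independent)
```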

3. A factory that discharges waste water into the sewage system is required to monitor the arsenic levels in its waste water and report the results to the Environmental Protection Agency (EPA) at regular intervals. Twenty-five beakers of waste water from the discharge are obtained at randomly chosen times. The measurement of arsenic is in nanograms per liter for each beaker of water obtained. The summary statistics from Minitab are presented below.

Descriptive Statistics
Variable: Arsenic

Anderson-Darling Normality Test
A-Squared:  1.966
P-Value:    0.000

Mean          32.5560
StDev         31.2459
Variance      976.308
Skewness      2.33222
Kurtosis      6.15526
N             25

Minimum       3.200
1st Quartile  15.050
Median        24.000
3rd Quartile  37.550
Maximum       152.700

95% Confidence Interval for Mu:      19.658  45.454
95% Confidence Interval for Sigma:   24.398  43.468
95% Confidence Interval for Median:  20.740  35.725

a. The company claims that the mean output of arsenic is 20 nanograms per liter. Stating any necessary assumptions, test the null hypothesis at the 5% level against the two-sided alternative.

Assumptions:
H_0:
H_a:
Conclusion:

b. What is the p-value associated with the above test?

c. Use the p-value from the Anderson-Darling test statistic (given near the top of the output on the right-hand side) to test the null hypothesis that the data are normally distributed. Do the test at the 5% level.

H_0:
H_a:
Conclusion:

d. What implications, if any, does the normality test result have on your test in part a?
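The mechanics of the one-sample t test in question 3a can be sketched from the summary statistics alone. The critical value 2.064 is the two-sided 5% point of the t distribution with 24 degrees of freedom, read from a standard t table (an assumption here, since the exam's attached table is not reproduced in this text). The calculation itself takes the usual t-test assumptions for granted; whether they hold is what parts c and d ask about.

```python
import math

# Summary statistics copied from the Minitab output for Arsenic.
n = 25
sample_mean = 32.5560
sample_sd = 31.2459
mu_0 = 20.0      # value claimed by the company (null hypothesis)

# Standard error of the mean and the t statistic.
se = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / se

# Assumed critical value: t(0.975, df = 24) from a standard t table.
t_crit = 2.064
reject = abs(t_stat) > t_crit

# The matching 95% confidence interval for the mean.
ci = (sample_mean - t_crit * se, sample_mean + t_crit * se)

print(f"t = {t_stat:.3f}, reject H0 at 5%: {reject}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval computed this way should agree with the "95% Confidence Interval for Mu" line in the Minitab output up to rounding.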

4. Let $X_i$ denote the number of heads on two tosses of a coin. So, $X_i$ can take the values 0, 1, or 2.

a. What is the probability of each outcome? Note that you can find the probability of no heads and the probability that both outcomes are heads. The probability of just one head must be 1 minus the probability of the other two outcomes.

b. Let $Y = X_1 + X_2 + X_3$ where the $X_i$ are iid. What is the mean of Y?

c. What is the variance of Y?

d. What are the possible values that the random variable Y can take?

e. What is the distribution of Y (be as explicit as possible)?
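One way to check work on question 4 is to build the distribution of Y by brute force. The sketch below assumes a fair coin (the question does not say so explicitly) and uses exact fractions; `convolve` is a hypothetical helper that combines the distributions of two independent discrete variables.

```python
from fractions import Fraction

# Distribution of the number of heads in two tosses.
# Assumption: the coin is fair, so P(heads) = 1/2 on each toss.
single = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def convolve(a, b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out

# Y = X1 + X2 + X3 with the X_i iid.
dist_y = convolve(convolve(single, single), single)

mean_y = sum(v * p for v, p in dist_y.items())
var_y = sum((v - mean_y) ** 2 * p for v, p in dist_y.items())

for value in sorted(dist_y):
    print(value, dist_y[value])
print("mean:", mean_y, "variance:", var_y)
```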

5. Consider the regression output of sales price of a home on the number of bedrooms.

The regression equation is
Price = 102 + 10.2 Number_Bedrooms

Predictor      Coef  StDev      T      P
Constant    102.188  3.463  29.51  0.000
Number_B     10.158  1.095   9.28  0.000

S = 11.68   R-Sq = 36.8%   R-Sq(adj) = 36.3%

a. Test the null hypothesis that the number of bedrooms (Number_B) has no impact on the sales price of the home at the 5% level.

Now consider the multiple regression output of sales price on number of bedrooms and square footage.

The regression equation is
Price = 49.1 + 4.81 Square_Footage - 0.18 Number_Bedrooms

Predictor     Coef   StDev      T      P
Constant    49.111   6.749   7.28  0.000
Square_F    4.8121  0.5556   8.66  0.000
Number_B    -0.178   1.491  -0.12  0.905

S = 9.535   R-Sq = 58.1%   R-Sq(adj) = 57.6%

b. Test the null hypothesis that the number of bedrooms (Number_B) has no impact on the sales price of the home at the 5% level using the multiple regression output.

c. What is the interpretation of the coefficient on Number_B in the multiple regression? (Use a couple of explicit sentences.)

d. What is the intuition behind the reduction in the test statistic for the coefficient on the number of bedrooms when the multiple regression is run instead of the simple linear regression? (Again, a couple of explicit sentences should be sufficient.)

Next, I regressed the sales price on the square footage.

The regression equation is
Price = 49.5 + 4.76 Square_Footage

Predictor     Coef   StDev      T      P
Constant    49.502   5.881   8.42  0.000
Square_F    4.7592  0.3320  14.33  0.000

S = 9.504   R-Sq = 58.1%   R-Sq(adj) = 57.8%

The following is the output from Minitab predicting the sales value of the home given that the square footage is 18 (1800 square feet).

Predicted Values
    Fit  StDev Fit            95.0% CI              95.0% PI
135.167      0.790  (133.606, 136.727)  (116.322, 154.012)

e. What is the expected sales price given the house has 1800 square feet?

f. Explain in English what the 95% CI is.

g. Explain in English what the 95% PI is.

h. Use the plug-in method to calculate an approximate value for the 95% PI.
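For part h, a rough sketch of an approximate prediction interval is shown below. It assumes that "the plug-in method" means treating the estimated coefficients as if they were the true values and using fit ± 2·S; that reading is an assumption, not something the exam text spells out.

```python
# Estimates copied from the regression of Price on Square_Footage.
intercept = 49.502
slope = 4.7592
s = 9.504        # residual standard error (S in the Minitab output)
x_new = 18       # square footage in hundreds of square feet

# Point prediction from the fitted line.
fit = intercept + slope * x_new

# Assumed plug-in interval: ignore estimation error in the coefficients
# and use roughly two residual standard deviations on either side.
pi_low = fit - 2 * s
pi_high = fit + 2 * s

print(f"fit = {fit:.3f}, approximate 95% PI = ({pi_low:.3f}, {pi_high:.3f})")
```

Under that reading the result is close to, but slightly different from, the exact 95.0% PI that Minitab reports, because the exact interval also accounts for uncertainty in the fitted line and uses a t critical value.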

6. Consider the following output from regressing IBM returns on S&P500 returns. Recall that the slope coefficient in this regression model is referred to as IBM's "beta".

The regression equation is
ibm = 0.000810 + 1.10 s&p

Predictor       Coef      StDev      T      P
Constant   0.0008097  0.0006005   1.35  0.178
s&p          1.09772    0.05486  20.01  0.000

S = 0.01903   R-Sq = 28.4%   R-Sq(adj) = 28.3%

a. Test the null hypothesis that IBM's beta is 1.0 at the 5% level.

b. Build a 95% CI for IBM's beta using the above regression output.

7. In May of 2000, 511 people out of 1000 surveyed nationwide said that they thought that it was all right for state capitols to fly the confederate flag.

a. What is the estimate $\hat{p}$ of the true population proportion that thinks it is "all right" for state capitols to fly the flag?

b. Build a 95% CI for the true population parameter.

c. Test the null hypothesis that the true value of p is .5 at the 5% level.
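The test and interval in question 6 both come from the slope estimate and its standard error. The sketch below assumes the sample is large enough that the normal critical value 1.96 is a reasonable stand-in for the t critical value; the regression output shown does not state the sample size, so that is an assumption.

```python
# Slope estimate and standard error copied from the IBM on S&P500 regression.
beta_hat = 1.09772
se_beta = 0.05486
beta_null = 1.0    # hypothesized beta

# Test statistic for H0: beta = 1.0. Note that this is not the T column in
# the output, which tests H0: beta = 0.
t_stat = (beta_hat - beta_null) / se_beta

# Assumed large-sample critical value for a two-sided 5% test.
z_crit = 1.96
reject = abs(t_stat) > z_crit

# 95% confidence interval for beta.
ci = (beta_hat - z_crit * se_beta, beta_hat + z_crit * se_beta)

print(f"t = {t_stat:.3f}, reject H0 at 5%: {reject}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The same estimate-plus-or-minus-critical-value-times-standard-error pattern carries over to the proportion in question 7, with the standard error of $\hat{p}$ in place of the slope's standard error.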

8. Consider the following regression of Y on X.

The regression equation is
y = 2.80 + 0.252 x

Predictor     Coef   StDev      T      P
Constant    2.8035  0.1011  27.72  0.000
x           0.2523  0.1134   2.23  0.028

S = 1.015   R-Sq = 4.8%   R-Sq(adj) = 3.8%

The following plot is of the residual versus X.

[Figure: Residuals Versus x (response is y). Vertical axis: Residual; horizontal axis: x.]

a. What assumption of the regression model appears to be violated here?

The next plot is a simple scatter plot of Y versus X.

[Figure: scatter plot of y versus x.]

b. After viewing the scatter plot, why should we not expect the regression model of Y on X estimated above to be a good model?
