Business Statistics 41000: Homework # 5

Drew Creal

Due date: Beginning of class in week # 10

Remarks: These questions cover Lectures # 7, 8, and 9.

Question # 1. Confidence intervals and plug-in predictive intervals

There is concern about the speed of automobiles traveling over a particular stretch of highway. For a random sample of n = 7 automobiles, radar indicated the following speeds, measured in miles per hour:

79, 73, 68, 77, 86, 71, 69

Suppose we're willing to assume that the speed of each automobile is i.i.d. Normal.

(a) Find the sample mean and sample standard deviation. (If you want, just type the data into Excel, but try to do parts (b)-(c) by hand.)

x̄ = 74.71, s_x = 6.40, n = 7

(b) Using our usual x̄ ± 2 S.E. formula, find a 95% confidence interval for the mean speed of all

automobiles traveling over this stretch of highway.

The confidence interval is:

x̄ ± 2 S.E. = 74.71 ± 2 × 6.40/√7 = 74.71 ± 4.838 = (69.88, 79.55)

(c) Construct a 95% plug-in predictive interval for the speed of the next car that is clocked over this same stretch of highway.

The predictive interval is:

x̄ ± 2 s_x = 74.71 ± 2 × 6.40 = 74.71 ± 12.8 = (61.92, 87.51)

(d) (OPTIONAL: It is easy.) Since this is a small sample, we should really use the Student's t distribution to make our intervals in parts (b) and (c). Recompute both intervals using the appropriate tval from Excel instead of 2. What happens to your intervals?

The 95% confidence interval is:

x̄ ± tval × S.E. = 74.71 ± 2.45 × 6.40/√7 = 74.71 ± 5.926 = (68.80, 80.63)

The 95% predictive interval is: (59.06, 90.36)

tval = TINV(0.05, 6) = 2.45

The width of both intervals has increased.

(e) Suppose instead of seeing 7 cars, we had seen a sample of n = 70 cars with the same sample mean and sample variance as in part (a). How do your answers to (b) and (c) change?

The 95% confidence interval is:

x̄ ± 2 S.E. = 74.71 ± 2 × 6.40/√70 = 74.71 ± 1.529 = (73.19, 76.24)

If the sample mean and sample standard deviation are the same, the 95% plug-in predictive interval would not change.

(f) Now suppose I am no longer willing to assume that the speed of each car is normally distributed (I still think they are i.i.d.). Are the intervals you constructed in (b) and (c) still valid? Does your answer depend on whether n = 7 or n = 70?

This is a hard question because it requires you to think carefully about the assumptions we've made and the role they play in interpreting confidence intervals and predictive intervals. I'll give you the full explanation.

For the confidence intervals, we generally don't need to assume the data are normally distributed. This is because of the Central Limit Theorem, which tells us that in a reasonably-sized sample, averages of i.i.d. random variables are normal. I've deliberately left open the question of what is a "reasonably-sized sample" (if you want, you can always fall back on most people's favorite n > 30 rule of thumb). But keep in mind, if we are willing to assume the data are normal it's

a moot point, because any linear function of normal random variables is normal (and the average is our favorite linear function). In any case, though, 7 observations is NOT enough to apply the CLT. So, the basic point is: for n = 7 observations, our confidence interval is valid only if we assume the data are normally distributed. For n = 70 observations, our confidence interval is (approximately) correct regardless of whether the data are normal, due to the Central Limit Theorem.

Now, what about the predictive interval? This is part of the reason I've emphasized in class that while the formulas for the confidence interval and predictive interval look similar, the two are very different because they are asking different questions. In particular, the confidence interval is telling you (approximately) how close our estimator x̄ is to the actual unknown µ, while the predictive interval is asking us to give a range for a single random outcome. And that's exactly why, when we're talking about predictive intervals, the CLT does not apply! The Central Limit Theorem is a statement about averages, and there's no averaging going on when we're trying to predict a single random outcome. Therefore, regardless of whether n = 7 or 70 (or 7 million), we need to assume the data are (at least approximately) normal for our plug-in predictive interval to be correct!

Of course we can make predictions for non-normal data: the predictive interval can be generalized to other types of distributions (like for our bank arrival data). In any case, however, we need to assume a probability model for the data in order to make predictions!
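If you would rather check the Question 1 arithmetic in Python than Excel, here is a minimal sketch (it assumes numpy and scipy are installed; the data are just the seven speeds above):

    import numpy as np
    from scipy import stats

    speeds = np.array([79, 73, 68, 77, 86, 71, 69])
    n = len(speeds)
    xbar = speeds.mean()                        # ~ 74.71
    s = speeds.std(ddof=1)                      # sample standard deviation, ~ 6.40

    se = s / np.sqrt(n)
    print(xbar - 2 * se, xbar + 2 * se)         # (b): 95% confidence interval for the mean
    print(xbar - 2 * s, xbar + 2 * s)           # (c): 95% plug-in predictive interval

    # (d): replace 2 with the Student's t critical value on n - 1 = 6 df
    tval = stats.t.ppf(0.975, df=n - 1)         # ~ 2.447, what Excel's TINV(0.05, 6) returns
    print(xbar - tval * se, xbar + tval * se)   # wider confidence interval
    print(xbar - tval * s, xbar + tval * s)     # wider predictive interval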

Question # 2. Hypothesis tests and p-values

In a random sample of 160 business school graduates, 72 agreed with the statement, "A reputation of ethical conduct is less important for a manager's chances of promotion than a reputation for making money for the firm."

(a) Test the null hypothesis (at the 5% level) that the true proportion that would agree is 0.5.

We have p_0 = 0.5, p̂ = 72/160 = 0.45, and n = 160. Therefore, the test statistic is:

z = (0.45 − 0.5) / √(0.5 × (1 − 0.5) / 160) = −1.26

We fail to reject at the 5% level, since |z| < 2.

(b) Find the p-value for the test in (a).

p-value = 2 × (1 − NORMDIST(ABS(−1.26), 0, 1, 1)) = 0.2077

Again, we fail to reject at the 5% level, as the p-value is larger than 0.05.
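As a check, here is the same test in Python (a sketch using only the summary counts from the problem):

    import numpy as np
    from scipy import stats

    n, p0, phat = 160, 0.5, 72 / 160
    se0 = np.sqrt(p0 * (1 - p0) / n)            # standard error under the null
    z = (phat - p0) / se0                       # ~ -1.26
    pval = 2 * (1 - stats.norm.cdf(abs(z)))     # two-sided p-value, ~ 0.206
    print(z, pval)                              # (0.2077 if z is rounded to -1.26 first)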

Question # 3. Confidence intervals and p-values

Suppose you are working for a certain candidate in an election. The candidate claims that 70% of voters support him.

(a) You get a random sample of 20 voters. You get a sample proportion of p̂ = 0.5 voters that support the candidate. Test the null hypothesis that p = 0.7 (at the 5% level). Does it make sense to say you have proven or accepted the null hypothesis?

We have p_0 = 0.7, p̂ = 0.5, and n = 20. This means the test statistic is:

z = (0.5 − 0.7) / √(0.7 × (1 − 0.7) / 20) = −1.9518

We fail to reject, since |z| < 2.

The 95% confidence interval is:

p̂ ± 2 S.E. = 0.5 ± 2 × √(0.5 × (1 − 0.5) / 20) = 0.5 ± 0.223606 = (0.2763932, 0.7236068)

Remember, we never accept the null hypothesis. Instead, we fail to reject it.

(b) Now suppose you have a random sample of 10,000 voters and p̂ = 0.71. Test the null hypothesis that p = 0.7 (at the 5% level). Compute the confidence interval for p. Do you reject the null? Does it necessarily make sense to tell the candidate that he is wrong?

The test statistic is z = 2.182179, so we would reject at level 0.05. The 95% confidence interval is:

p̂ ± 2 S.E. = 0.71 ± 2 × √(0.71 × (1 − 0.71) / 10000) = 0.71 ± 0.009075 = (0.7009248, 0.7190752)

So, does it necessarily make sense to tell the candidate that he is wrong? Probably not. Even though we formally reject, the data tell us that as a practical matter the candidate is right: p is basically 0.7. Also, if anything, the actual value of p is higher than 0.7.
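A small Python sketch contrasting the two samples (summary numbers only; note the test statistic uses the null value p_0 in the standard error, while the confidence interval uses the estimate p̂):

    import numpy as np

    def prop_test(phat, p0, n):
        # z-statistic: standard error computed under the null
        z = (phat - p0) / np.sqrt(p0 * (1 - p0) / n)
        # 95% CI: standard error computed from the estimate
        se_hat = np.sqrt(phat * (1 - phat) / n)
        return z, (phat - 2 * se_hat, phat + 2 * se_hat)

    print(prop_test(0.5, 0.7, 20))      # z ~ -1.95: fail to reject at the 5% level
    print(prop_test(0.71, 0.7, 10000))  # z ~ 2.18: reject, yet the CI sits right around 0.7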

Question # 4. Confidence intervals and hypothesis tests for IID normal data

In class we focused on hypothesis tests for p with i.i.d. Bernoulli(p) data. Many of the hypothesis tests you will encounter are for µ with i.i.d. Normal data. Suppose we have a sample of observations X_i and we assume that X_1, X_2, ..., X_n are i.i.d. Normal(µ, σ²).

(a) Suppose we want to test the null hypothesis H_0: µ = µ_0, where µ_0 is just some number, our guess for the unknown mean µ. Also suppose we know the variance σ². Under the null hypothesis, what is the distribution of the sample mean? Another way to ask this question is, what is the sampling distribution of the sample mean?

Before we see the data, we are treating the observations in our sample X_i as i.i.d. random variables. So the sample mean is also a random variable. On a previous homework, we worked out the mean and variance of a sample average:

E[x̄] = E[(1/n) Σ_{i=1}^{n} x_i] = (1/n) Σ_{i=1}^{n} E[x_i] = (1/n) Σ_{i=1}^{n} µ = (1/n) × nµ = µ

V[x̄] = V[(1/n) Σ_{i=1}^{n} x_i] = (1/n²) Σ_{i=1}^{n} V[x_i] = (1/n²) Σ_{i=1}^{n} σ² = (1/n²) × nσ² = σ²/n

Now we know the mean and variance of x̄ and we can apply the Central Limit Theorem, which tells us that averages are normally distributed when n is large. When n is large, the sample mean is distributed as x̄ ~ N(µ, σ²/n).

Now look at your answer to part (a). Suppose that we take as given that for large n, it doesn't matter that we don't know the variance σ²; we can just replace it with the sample variance, s²_X (just like we did for the confidence interval). Under this assumption, the test statistic is

z = (x̄ − µ_0) / S.E.(x̄) ~ N(0, 1)

Why? If you got the right answer to part (a), you should see that we are just taking x̄, subtracting its mean, and dividing by its standard deviation (since dividing by S.E.(x̄) is just like dividing by σ/√n). In other words, we are just standardizing x̄. So just like the notes, z is telling you, "if our claim about µ is true, how many standard deviations is x̄ away from its mean?" Recognize that this test statistic has the same distribution N(0, 1) as the Bernoulli example we used in Lecture # 8. We can do accept/reject decisions and compute p-values the SAME way!!

(b) Let's look at the Canadian returns data again. Assuming the returns are i.i.d., write down a 95% confidence interval for µ. Now test the null hypothesis that µ = 0.01 at the 5% level. Do you reject the null hypothesis?

The 95% confidence interval is x̄ ± 2 S.E. = x̄ ± 2 s_x/√n, or in this example:

0.00907 ± 2 × (0.00371) = (0.00172, 0.01641)

The test statistic for our hypothesis test of H_0: µ = 0.01 is (x̄ − µ_0)/S.E.(x̄), or in this example:

(0.00907 − 0.01) / 0.00371 = −0.2507

We fail to reject the null hypothesis.

(c) Using the same data, test the following two null hypotheses at the 5% level:

1. H_0: µ = 0
2. H_0: µ = 0.018

For each one, do you reject or fail to reject?

The appropriate z-values are:

1. (0.00907 − 0) / 0.00371 = 2.445
2. (0.00907 − 0.018) / 0.00371 = −2.407

In both cases, we reject H_0.

Stop and think about this for a minute. A confidence interval and a hypothesis test are two ways of saying what we think about the true value of µ. Remember the formula for our 95% CI:

x̄ ± 2 S.E.(x̄)

Now look at the formula for the test statistic:

z = (x̄ − µ_0) / S.E.(x̄)

(d) Look at the test statistic we used to test H_0: µ = µ_0. The test statistic compares two values for µ: the value we "guessed", µ_0, and the value we actually estimated from our data, x̄. The difference between the two values gets divided by S.E.(x̄). So this is like asking, "how many standard errors apart are x̄ and µ_0?" If x̄ and µ_0 are more than two standard errors apart, we will get a z-value greater than 2 in absolute value and will reject H_0 at the 5% level. But now look at the formula for the confidence interval again: the 95% CI contains all the values within two standard errors of x̄. If our "guess" µ_0 is outside the 95% CI, we will reject H_0!
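Here is a sketch of the (b)-(c) calculations, plugging in the summary values quoted in this solution (x̄ = 0.00907 and S.E. = 0.00371 for the Canadian returns):

    xbar, se = 0.00907, 0.00371   # sample mean and standard error from the solution

    print(xbar - 2 * se, xbar + 2 * se)   # 95% CI, roughly (0.0017, 0.0164)
    for mu0 in (0.01, 0.0, 0.018):
        z = (xbar - mu0) / se
        print(mu0, round(z, 3), "reject" if abs(z) > 2 else "fail to reject")

Notice that 0.01 lies inside the confidence interval and is not rejected, while 0 and 0.018 lie outside it and are rejected, which is exactly the duality described in part (d).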

Question # 5. Understanding the simple linear regression model

Y_i = −2 + 3.5 X_i + ε_i,   ε_i ~ N(0, 9)

(a) What is E(Y | X = 0)? How about E(Y | X = 1)?

E(Y | X = 0) = −2 + 3.5 × 0 = −2
E(Y | X = 1) = −2 + 3.5 × 1 = 1.5

(b) What is V(Y | X)?

V(Y | X) = V(ε | X) = 9

(c) Compute a 95% prediction interval for Y given X = 1.

The prediction interval uses the fact that the error term is normally distributed. Consequently, 95% of the probability is within two standard deviations of the mean.

(−2 + 3.5 × 1 − 2 × 3, −2 + 3.5 × 1 + 2 × 3) = (−4.5, 7.5)

(d) Compute a 95% prediction interval for the average of two Y values (i.e., (Y_1 + Y_2)/2), given that X_1 = 1 and X_2 = 1.

The average of two Y values is a linear combination of two normal random variables. In Lecture # 5, we learned that linear combinations of normal random variables are normally distributed.

We can use our formulas for linear combinations to calculate the mean and variance of the average.

E[(Y_1 + Y_2)/2 | X_1 = 1, X_2 = 1] = (1/2)(−2 + 3.5 × 1) + (1/2)(−2 + 3.5 × 1) = 1.5

V[(Y_1 + Y_2)/2 | X_1 = 1, X_2 = 1] = (1/2)² × 9 + (1/2)² × 9 = 2.25 + 2.25 = 4.5

(e) What is Pr(Y < 8 | X = 2)?

Y ~ N(−2 + 3.5 × 2, 9) = N(5, 9), so Pr(Y < 8) = Pr(Z < (8 − 5)/3) = Pr(Z < 1) = 0.8413.
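A quick Python sketch of these calculations (scipy's normal cdf plays the role of the normal table):

    from scipy import stats

    alpha, beta, sigma = -2.0, 3.5, 3.0       # intercept, slope, sd of the errors

    m1 = alpha + beta * 1                     # E(Y | X = 1) = 1.5
    print(m1 - 2 * sigma, m1 + 2 * sigma)     # (c): (-4.5, 7.5)

    var_avg = 2 * (0.5 ** 2) * sigma ** 2     # (d): variance of the average = 4.5
    print(m1, var_avg)

    m2 = alpha + beta * 2                     # E(Y | X = 2) = 5
    print(stats.norm.cdf(8, loc=m2, scale=sigma))   # (e): ~ 0.8413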

Question # 6. The simple linear regression model

Suppose we are modeling house price as depending on house size. Price is measured in thousands of dollars and size is measured in thousands of square feet. Suppose our model is:

P = 20 + 50x + ε,   ε ~ N(0, 15²)

(a) What are the units of the intercept α (the 20) and the standard deviation of the errors σ (the 15)?

Thousands of dollars, the same units as P.

(b) What are the units of the slope β (the 50)?

Thousands of dollars per thousand square feet.

(c) Suppose you know a particular house has size x_1 = 1.6. What is the conditional distribution of its price, P_1, given that size? Give a 95% predictive interval for the price of the house.

The conditional distribution is:

P_1 | X_1 = 1.6 ~ N(20 + 50 × 1.6, 15²) = N(100, 15²)

The 95% predictive interval is:

20 + 50 × 1.6 ± 2 × 15 = 100 ± 30 = (70, 130)

Now suppose another house has size x_2 = 2.2. What is the conditional distribution of its price, P_2, given that size? Give a 95% predictive interval for the price of the house.

The conditional distribution is:

P_2 | X_2 = 2.2 ~ N(20 + 50 × 2.2, 15²) = N(130, 15²)

The 95% predictive interval is:

20 + 50 × 2.2 ± 2 × 15 = 130 ± 30 = (100, 160)

(d) Let's call the first house in (c) house 1, with size x_1 = 1.6 and price P_1. The second house in (c) has size x_2 = 2.2 and price P_2. What is the distribution of the difference in prices, P_2 − P_1, given that we know the sizes are x_2 = 2.2 and x_1 = 1.6 as above? What assumptions are you making about the errors?

The mean and variance of the price difference are:

E[P_2 − P_1] = E[P_2] − E[P_1]
V[P_2 − P_1] = V[P_2] + V[P_1] − 2 COV(P_1, P_2)

Now, we assume that the errors are independent. This means COV(P_1, P_2) = 0. Therefore, using the results we just derived, we know that

P_2 − P_1 ~ N(130 − 100, 450) = N(30, 450)
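The same arithmetic as a sketch in Python:

    alpha, beta, sigma = 20.0, 50.0, 15.0

    for x in (1.6, 2.2):
        m = alpha + beta * x
        print(x, (m - 2 * sigma, m + 2 * sigma))   # (70, 130) and (100, 160)

    # (d): with independent errors, the variances add for the difference
    mean_diff = beta * (2.2 - 1.6)                 # 30
    var_diff = 2 * sigma ** 2                      # 450
    print(mean_diff, var_diff)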

Question # 7. The simple linear regression model

For this question, use the beer data (beer.xls) we talked about earlier in Lecture # 2. We want to know how the number of beers a student claims to be able to drink is related to the student's body weight. Suppose we believe this relationship is

B_i = α + β w_i + ε_i,   ε_i ~ N(0, σ²) i.i.d.

Here B_i is the number of beers the i-th student claims to be able to drink before becoming intoxicated and w_i is the i-th student's body weight in pounds.

(a) Make a scatter plot of nbeer versus weight. Does a linear relationship seem reasonable? Do you think the slope in this linear relationship is positive or negative?

[Scatter plot of Beers (y-axis, roughly 2.5 to 20) against Weight (x-axis, roughly 100 to 230 pounds).]

A linear relationship looks reasonable. We expect the slope to be positive.

(b) Run this regression in Excel. Suppose we have a student who weighs 150 pounds. What is the 95% plug-in predictive interval for the number of beers this student will claim to be able to drink?

Here is the output from Excel.

Results of simple regression for nbeer

Summary measures
  Multiple R      0.6921
  R-Square        0.4789
  StErr of Est    2.7598

ANOVA table
  Source       df   SS        MS        F        p-value
  Explained    1    336.0318  336.0318  44.1188  0.0000
  Unexplained  48   365.5932  7.6165

Regression coefficients
             Coefficient  Std Err  t-value  p-value  Lower limit  Upper limit
  Constant   -7.0207      2.2133   -3.1721  0.0026   -11.4708     -2.5706
  weight      0.0929      0.0140    6.6422  0.0000     0.0648      0.1210

The 95% plug-in predictive interval is a + b × x ± 2 s_e:

a + b × x ± 2 s_e = −7.02 + 0.0929 × 150 ± 2 × 2.75 = (1.39, 12.43)

(c) Based on this data, do we believe there may be a significant relationship between body weight and claims about drinking capacity? Explain.

Yes. If we test the null hypothesis H_0: β = 0, we get a p-value that is zero to at least four decimal places (Excel outputs this automatically). In other words, there is significant evidence that claims about drinking ability are related to body weight.
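If you want to reproduce this output outside of Excel, here is a sketch with pandas and statsmodels; I am assuming the file loads with columns named 'nbeer' and 'weight' (the exact names in your copy of beer.xls may differ):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_excel("beer.xls")               # column names assumed: nbeer, weight
    fit = sm.OLS(df["nbeer"], sm.add_constant(df["weight"])).fit()
    print(fit.summary())                         # compare with the Excel table above

    a, b = fit.params                            # ~ -7.02 and ~ 0.0929
    s_e = fit.mse_resid ** 0.5                   # 'StErr of Est', ~ 2.76
    pred = a + b * 150
    print(pred - 2 * s_e, pred + 2 * s_e)        # 95% plug-in predictive interval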

(d) Someone makes the statement, "For each additional 10 pounds of body weight, a person will generally claim s/he can drink one additional beer before becoming intoxicated." Using a hypothesis test, does the evidence in the data support this claim?

The hypothesis we're interested in here is H_0: β = 1/10. (Since the slope is measured in beers per pound, this is one more beer per ten additional pounds of body weight.) The test statistic is:

t = (b − β_0) / s_b = (b − 1/10) / s_b = (0.0929 − 0.10) / 0.014 = −0.508

We fail to reject at the 5% level. This means that there is no conclusive evidence against this claim (though we still can't say for sure it is true!).

(e) Insert a new worksheet and copy the first four (nbeer, weight) data points over to the new worksheet. Run the regression again using only the first four observations. Are the confidence intervals for α and β wider or narrower than those we calculated using all the data? Why?

See below.

(f) Using the regression you ran in part (e), test the null hypothesis H_0: β = 0. Does this prove that weight has no effect on claimed drinking capacity?

The first four data points:

  nbeer  weight
  12     192
  12     160
  5      155
  5      120

Results of multiple regression for nbeer

Summary measures
  Multiple R      0.7541
  R-Square        0.5686
  Adj R-Square    0.3529
  StErr of Est    3.2510

ANOVA Table
  Source       df   SS       MS       F       p-value
  Explained    1    27.8624  27.8624  2.6363  0.2459
  Unexplained  2    21.1376  10.5688

Regression coefficients
             Coefficient  Std Err  t-value  p-value  Lower limit  Upper limit
  Constant   -7.7057      10.1124  -0.7620  0.5257   -51.2159     35.8046
  weight      0.1034      0.0637    1.6237  0.2459    -0.1706      0.3774

Note how much bigger the standard error s_b is using only four data points (it was about 0.014 before). This is because with only four data points, we are not able to measure the relationship between nbeer and weight very well.

We now fail to reject H_0: β = 0 at the 5% level. However, it is silly to say we have proven there is no relationship between beers and weight. It's simply that with only four data points we aren't measuring this relationship accurately. This is why we NEVER say we've "accepted" or "proven" a null hypothesis. When we're doing hypothesis testing and we fail to reject, it could be because the null is true, or it could be because we don't have enough information in our sample to get an accurate measurement. So we simply say "fail to reject", meaning that based on this data, we do not have conclusive evidence against the claim.
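For reference, the part (d) test is easy to replicate from the full-sample output (a sketch using the quoted estimates b = 0.0929 and s_b = 0.0140 on 48 degrees of freedom):

    from scipy import stats

    b, s_b, dof = 0.0929, 0.0140, 48
    t = (b - 0.10) / s_b                          # ~ -0.51
    pval = 2 * (1 - stats.t.cdf(abs(t), dof))     # ~ 0.61, far above 0.05
    print(t, pval)                                # fail to reject H0: beta = 1/10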

Question # 8. The market model

A well-known model in finance assumes that the rate of return on a stock R_{i,t} is linearly related to the rate of return on the overall stock market R_{M,t}. Here, the subscript i denotes the individual company and the subscript t denotes time. In practice, the market return R_{M,t} is taken to be the rate of return on some major stock-market index (e.g., the S&P 500). The regression model is

R_{i,t} = α + β R_{M,t} + ε_{i,t},   ε_{i,t} ~ N(0, σ²)

The slope coefficient β, called the "beta" of the stock, measures the sensitivity of the stock's rate of return to changes in the level of the overall market. For example, if β > 1 then the stock's rate of return is more sensitive to changes in the level of the overall market than the average stock. Conversely, if β < 1, then the stock's rate of return is less sensitive to market changes than the average stock.

An alternative way to view the market model is as a model for the conditional mean of returns:

E(R_{i,t} | R_{M,t}) = α + β R_{M,t}

This says that we expect the average return on a stock to depend on the market as a whole. The data for this problem is in the file stockreturns.xlsx on the course homepage. This file contains the monthly returns on six different stocks as well as the S&P 500 (GSPC) from February 1980 to March 2014.

(a) Choose any three stocks and run a regression of the market model. What do you find? Which company has the largest β?

Here are the results for GE. This company had the highest beta.

[Excel regression output for GE.]

(b) In Excel, a quick way to calculate the beta of a stock is to use the =SLOPE() function. This function reports the slope coefficient in a simple linear regression model. Using this function, check the solutions for the slope coefficients in part (a).

Using =SLOPE(), I estimated the same regression for GE. As you can see, we get the same results.

[Worksheet output for GE using =SLOPE().]
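The =SLOPE() shortcut has an equally quick equivalent outside Excel: the least squares slope is cov(x, y)/var(x). A sketch (the 'GSPC' and 'GE' column names are my assumption about how stockreturns.xlsx is laid out):

    import numpy as np
    import pandas as pd

    df = pd.read_excel("stockreturns.xlsx")     # column names assumed
    x = df["GSPC"].to_numpy()                   # market (S&P 500) returns
    y = df["GE"].to_numpy()                     # one stock's returns

    beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    print(beta)                                 # should match the regression slope in (a)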

Over long periods of time (e.g., 20 years), we might be concerned that a company's true value of β (exposure to market risk) changes over time. There are many reasons this could happen. Over time, companies' business strategies may change, product lines may change, and competition may change. There are a number of ways we could check to see if β is stable over time. The remainder of this problem will show you two ways.

(c) Choose one of the companies that you think may have changed over time. (I chose IBM.) Split the entire sample of data into roughly two parts. For example, I selected 1980 to 1996 and then 1997 to 2014. Run two separate regressions: one on the first half of the sample (1980 to 1996) and another on the second half of the sample (1997 to 2014). What do you find? Do the 95% confidence intervals for the slope coefficients overlap across the two samples?

Here are the results for IBM for the first and second half of the sample. We can see that the confidence regions overlap slightly. This provides some evidence of parameter instability.

[Regression output for IBM on the 1980-1996 and 1997-2014 subsamples.]
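A sketch of the split-sample check (again assuming a 'Date' column and per-stock return columns in stockreturns.xlsx):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_excel("stockreturns.xlsx", parse_dates=["Date"])
    halves = (df[df["Date"] < "1997-01-01"], df[df["Date"] >= "1997-01-01"])

    for half in halves:
        fit = sm.OLS(half["IBM"], sm.add_constant(half["GSPC"])).fit()
        lo, hi = fit.conf_int().loc["GSPC"]       # 95% CI for the slope
        print(round(fit.params["GSPC"], 3), (round(lo, 3), round(hi, 3)))

If the two confidence intervals barely overlap, that is informal evidence that β has shifted between the subsamples.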

(d) An alternative to splitting the sample into two parts is what is called a rolling regression. In a rolling regression, we start with an initial portion of our data (say the first T = 100 observations) and we run a regression. Then, we add the next observation to the data set but we throw out the initial observation (the total number of observations remains T = 100) and we run the regression again. A rolling regression repeats this procedure and rolls through time from the beginning to the end of the dataset. The idea is to see how stable the least squares slope coefficients are through time.

Here is a graph of the slope coefficients in the rolling regression for IBM. Unfortunately, the =SLOPE() function does not provide confidence intervals.

[Figure: rolling-regression slope coefficients for IBM, drifting between roughly 0.7 and 1.6 over the sample.]
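Here is a sketch of the rolling calculation with a window of T = 100 (same assumed column names as above):

    import numpy as np
    import pandas as pd

    df = pd.read_excel("stockreturns.xlsx")      # column names assumed
    x, y, T = df["GSPC"], df["IBM"], 100

    betas = []
    for start in range(len(df) - T + 1):
        xw = x.iloc[start:start + T]             # rolling 100-month window
        yw = y.iloc[start:start + T]
        betas.append(np.cov(xw, yw, ddof=1)[0, 1] / np.var(xw, ddof=1))

    print(pd.Series(betas).describe())           # how much does beta drift over time?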