Business Statistics 41000: Homework # 5
Drew Creal

Due date: beginning of class in week # 10

Remarks: These questions cover Lectures # 7, 8, and 9.

Question # 1. Confidence intervals and plug-in predictive intervals

There is concern about the speed of automobiles traveling over a particular stretch of highway. For a random sample of n = 7 automobiles, radar indicated the following speeds, measured in miles per hour:

79, 73, 68, 77, 86, 71, 69

Suppose we're willing to assume that the speed of each automobile is i.i.d. Normal.

(a) Find the sample mean and sample standard deviation. (If you want, just type the data into Excel, but try to do parts (b)-(c) by hand.)

x̄ = 74.71, s_x = 6.40, n = 7

(b) Using our usual x̄ ± 2 S.E. formula, find a 95% confidence interval for the mean speed of all
automobiles traveling over this stretch of highway.

The confidence interval is:

x̄ ± 2 S.E. = 74.71 ± 2(6.40/√7) = 74.71 ± 4.84 = (69.88, 79.55)

(c) Construct a 95% plug-in predictive interval for the speed of the next car that is clocked over this same stretch of highway.

The predictive interval is:

x̄ ± 2 s_x = 74.71 ± 2(6.40) = 74.71 ± 12.8 = (61.92, 87.51)

(d) (OPTIONAL: It is easy.) Since this is a small sample, we should really use the Student's t distribution to make our intervals in parts (b) and (c). Recompute both intervals using the appropriate tval from Excel instead of 2. What happens to your intervals?

The 95% confidence interval is:

x̄ ± tval × S.E. = (68.80, 80.63)

The 95% predictive interval is:

x̄ ± tval × s_x = (59.06, 90.36)
tval = TINV(0.05, 6) = 2.45

The width of both intervals has increased.

(e) Suppose instead of seeing 7 cars, we had seen a sample of n = 70 cars with the same sample mean and sample variance as in part (a). How do your answers to (b) and (c) change?

The 95% confidence interval is:

x̄ ± 2 S.E. = 74.71 ± 2(6.40/√70) = (73.19, 76.24)

If the sample mean and sample standard deviation are the same, the 95% plug-in predictive interval would not change.

(f) Now suppose I am no longer willing to assume that the speed of each car is normally distributed (I still think they are i.i.d.). Are the intervals you constructed in (b) and (c) still valid? Does your answer depend on whether n = 7 or n = 70?

This is a hard question because it requires you to think carefully about the assumptions we've made and the role they play in interpreting confidence intervals and predictive intervals. I'll give you the full explanation.

For the confidence intervals, we generally don't need to assume the data are normally distributed. This is because of the Central Limit Theorem, which tells us that in a reasonably-sized sample, averages of i.i.d. random variables are normal. I've deliberately left open the question of what is a 'reasonably-sized sample' (if you want, you can always fall back on most people's favorite n > 30 rule of thumb). But keep in mind, if we are willing to assume the data are normal it's
a moot point, because any linear function of normal random variables is normal (and the average is our favorite linear function). In any case, though, 7 observations is NOT enough to apply the CLT. So, the basic point is: for n = 7 observations, our confidence interval is valid only if we assume the data are normally distributed. For n = 70 observations, our confidence interval is (approximately) correct regardless of whether the data are normal, due to the Central Limit Theorem.

Now, what about the predictive interval? This is part of the reason I've emphasized in class that while the formulas for the confidence interval and the predictive interval look similar, the two are very different because they are asking different questions. In particular, the confidence interval is telling you (approximately) how close our estimator x̄ is to the actual unknown µ, while the predictive interval is asking us to give a range for a single random outcome. And that's exactly why, when we're talking about predictive intervals, the CLT does not apply! The Central Limit Theorem is a statement about averages, and there's no averaging going on when we're trying to predict a single random outcome. Therefore, regardless of whether n = 7 or 70 (or 7 million), we need to assume the data are (at least approximately) normal for our plug-in predictive interval to be correct!

Of course we can make predictions for non-normal data: the predictive interval can be generalized to other types of distributions (like for our bank arrival data). In any case, however, we need to assume a probability model for the data in order to make predictions!
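The arithmetic in parts (a)-(c) can be checked with a few lines of Python; this sketch uses only the seven speeds from the problem and the x̄ ± 2 S.E. and x̄ ± 2 s_x formulas above:

```python
import math

# Radar-clocked speeds (mph) for the n = 7 sampled cars
speeds = [79, 73, 68, 77, 86, 71, 69]
n = len(speeds)

xbar = sum(speeds) / n                                          # sample mean
s = math.sqrt(sum((x - xbar) ** 2 for x in speeds) / (n - 1))   # sample sd
se = s / math.sqrt(n)                                           # standard error

# (b) 95% confidence interval for the mean speed
ci = (xbar - 2 * se, xbar + 2 * se)

# (c) 95% plug-in predictive interval for the next car
pi = (xbar - 2 * s, xbar + 2 * s)
```

Rounding to two decimals reproduces x̄ = 74.71, s_x = 6.40, the confidence interval (69.88, 79.55), and the predictive interval (61.92, 87.51).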
Question # 2. Hypothesis tests and p-values

In a random sample of 160 business school graduates, 72 agreed with the statement, "A reputation of ethical conduct is less important for a manager's chances of promotion than a reputation for making money for the firm."

(a) Test the null hypothesis (at the 5% level) that the true proportion that would agree is 0.5.

We have p_0 = 0.5, p̂ = 72/160 = 0.45, and n = 160. Therefore, the test statistic is:

z = (p̂ − p_0) / √(p_0(1 − p_0)/n) = (0.45 − 0.5) / √(0.5(1 − 0.5)/160) = −1.26

We fail to reject at the 5% level, since |z| < 2.

(b) Find the p-value for the test in (a).

p-value = 2 × (1 − NORMDIST(ABS(−1.26), 0, 1, 1)) = 0.21

Again, we fail to reject at the 5% level, as the p-value is larger than 0.05.
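The same z statistic and two-sided p-value can be computed outside Excel; this sketch uses only the numbers from the problem and the standard normal cdf from Python's standard library:

```python
import math
from statistics import NormalDist

p0, n = 0.5, 160
phat = 72 / 160                               # = 0.45

se0 = math.sqrt(p0 * (1 - p0) / n)            # S.E. under the null
z = (phat - p0) / se0                         # test statistic
pval = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided p-value
```

This gives z ≈ −1.26 and a p-value of about 0.21, matching the NORMDIST calculation above.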
Question # 3. Confidence intervals and p-values

Suppose you are working for a certain candidate in an election. The candidate claims that 70% of voters support him.

(a) You get a random sample of 20 voters. You get a sample proportion of p̂ = 0.5 voters that support the candidate. Test the null hypothesis that p = 0.7 (at the 5% level). Does it make sense to say you have proven or accepted the null hypothesis?

We have p_0 = 0.7, p̂ = 0.5, and n = 20. This means the test statistic is:

z = (0.5 − 0.7) / √(0.7(1 − 0.7)/20) = −1.95

We fail to reject. The 95% confidence interval is:

p̂ ± 2 S.E. = 0.5 ± 2 √(0.5(1 − 0.5)/20) = 0.5 ± 0.22 = (0.28, 0.72)

Remember, we never accept the null hypothesis. Instead, we fail to reject it.

(b) Now suppose you have a random sample of 10,000 voters and p̂ = 0.71. Test the null hypothesis that p = 0.7 (at the 5% level). Compute the confidence interval for p. Do you reject the null? Does it necessarily make sense to tell the candidate that he is wrong?

The test statistic is z = 2.18, so we would reject at level 0.05. The 95% confidence interval is:

p̂ ± 2 S.E. = 0.71 ± 2 √(0.71(1 − 0.71)/10000) = 0.71 ± 0.009 = (0.701, 0.719)
So, does it necessarily make sense to tell the candidate that he is wrong? Probably not. Even though we formally reject, the data tell us that as a practical matter the candidate is right: p is basically 0.7. Also, if anything, the actual value of p is higher than 0.7.
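Both parts of this question run the same proportion test at different sample sizes, so a small helper makes the contrast easy to check numerically (a sketch using only the numbers given above):

```python
import math

def prop_test(phat, p0, n):
    """z statistic (S.E. under the null) and 95% CI (phat +/- 2 S.E.)."""
    z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)
    half = 2 * math.sqrt(phat * (1 - phat) / n)
    return z, (phat - half, phat + half)

z_small, ci_small = prop_test(0.5, 0.7, 20)        # part (a)
z_large, ci_large = prop_test(0.71, 0.7, 10_000)   # part (b)
```

With n = 20 we get z ≈ −1.95 and fail to reject; with n = 10,000 we get z ≈ 2.18 and reject, even though p̂ = 0.71 is practically indistinguishable from 0.7.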
Question # 4. Confidence intervals and hypothesis tests for i.i.d. normal data

In class we focused on hypothesis tests for p with i.i.d. Bernoulli(p) data. Many of the hypothesis tests you will encounter are for µ with i.i.d. Normal data. Suppose we have a sample of observations X_i and we assume that X_1, X_2, ..., X_n ~ i.i.d. Normal(µ, σ²).

(a) Suppose we want to test the null hypothesis H_0: µ = µ_0, where µ_0 is just some number, our guess for the unknown mean µ. Also suppose we know the variance σ². Under the null hypothesis, what is the distribution of the sample mean? (Another way to ask this question is: what is the sampling distribution of the sample mean?)

Before we see the data, we are treating the observations in our sample X_i as i.i.d. random variables. So the sample mean is also a random variable. On a previous homework, we worked out the mean and variance of a sample average:

E[x̄] = E[(1/n) Σ_{i=1}^n x_i] = (1/n) Σ_{i=1}^n E[x_i] = (1/n) · nµ = µ

V[x̄] = V[(1/n) Σ_{i=1}^n x_i] = (1/n²) Σ_{i=1}^n V[x_i] = (1/n²) · nσ² = σ²/n
Now we know the mean and variance of x̄, and we can apply the Central Limit Theorem, which tells us that averages are normally distributed when n is large. When n is large, the sample mean is distributed as x̄ ~ N(µ, σ²/n).

Now look at your answer to part (a). Suppose that we take as given that for large n, it doesn't matter that we don't know the variance σ²; we can just replace it with the sample variance s²_X (just like we did for the confidence interval). Under this assumption, the test statistic is

z = (x̄ − µ_0) / S.E.(x̄) ~ N(0, 1)

Why? If you got the right answer to part (a), you should see that we are just taking x̄, subtracting its mean, and dividing by its standard deviation (since dividing by S.E.(x̄) is just like dividing by σ/√n). In other words, we are just standardizing x̄. So just like the notes, z is telling you, "if our claim about µ is true, how many standard deviations is x̄ away from its mean?" Recognize that this test statistic has the same distribution, N(0, 1), as in the Bernoulli example we used in Lecture # 8. We can make accept/reject decisions and compute p-values the SAME way!!

(b) Let's look at the Canadian returns data again. Assuming the returns are i.i.d., write down a 95% confidence interval for µ. Now test the null hypothesis that µ = 0.01 at the 5% level. Do you reject the null hypothesis?

The 95% confidence interval is x̄ ± 2 S.E. = x̄ ± 2 s_x/√n, computed from the Canadian returns data. The test statistic for our hypothesis test of H_0: µ = 0.01 is (x̄ − 0.01)/S.E.(x̄). We fail to reject the null hypothesis.

(c) Using the same data, test the following two null hypotheses at the 5% level:

1. H_0: µ = 0
2. H_0: µ = 0.018
For each one, do you reject or fail to reject? Computing the z statistic for each null hypothesis with the same data, both values are larger than 2 in absolute value, so in both cases we reject H_0.

Stop and think about this for a minute. A confidence interval and a hypothesis test are two ways of saying what we think about the true value of µ. Remember the formula for our 95% CI:

x̄ ± 2 S.E.(x̄)

Now look at the formula for the test statistic:

z = (x̄ − µ_0) / S.E.(x̄)

(d) Look at the test statistic we used to test H_0: µ = µ_0. The test statistic compares two values for µ: the value we 'guessed', µ_0, and the value we actually estimated from our data, x̄. The difference between the two values gets divided by S.E.(x̄). So this is like asking, "how many standard errors apart are x̄ and µ_0?" If x̄ and µ_0 are more than two standard errors apart, we will get a z-value greater than 2 in absolute value and will reject H_0 at the 5% level. But now look at the formula for the confidence interval again: the 95% CI contains all the values within two standard errors of x̄. If our 'guess' µ_0 is outside the 95% CI, we will reject H_0!
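The Canadian returns series itself is not reproduced in this copy, so the sketch below draws a stand-in i.i.d. normal sample (the mean, volatility, and sample size are made-up values, chosen only to illustrate the mechanics of the CI and the z test for a normal mean):

```python
import math
import random

random.seed(0)

# Hypothetical stand-in for the returns data (values are assumptions,
# not the course dataset)
mu_true, sigma, n = 0.012, 0.03, 500
x = [random.gauss(mu_true, sigma) for _ in range(n)]

xbar = sum(x) / n
s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
se = s / math.sqrt(n)

ci = (xbar - 2 * se, xbar + 2 * se)   # 95% confidence interval for mu
z = (xbar - 0.01) / se                # test statistic for H0: mu = 0.01
reject = abs(z) > 2                   # reject at the 5% level?
```

The same three lines at the end (interval, z, decision) are all that changes across parts (b) and (c); only µ_0 moves.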
Question # 5. Understanding the simple linear regression model

Y_i = −2 + 3.5 X_i + ε_i,    ε_i ~ N(0, 9)

(a) What is E(Y | X = 0)? How about E(Y | X = 1)?

E(Y | X = 0) = −2 + 3.5(0) = −2
E(Y | X = 1) = −2 + 3.5(1) = 1.5

(b) What is V(Y | X)?

V(Y | X) = V(ε | X) = 9

(c) Compute a 95% prediction interval for Y given X = 1.

The prediction interval uses the fact that the error term is normally distributed. Consequently, 95% of the probability is within two standard deviations of the mean:

1.5 ± 2(3) = (−4.5, 7.5)

(d) Compute a 95% prediction interval for the average of two Y values, (Y_1 + Y_2)/2, given that X_1 = 1 and X_2 = 1.

The average of two Y values is a linear combination of two normal random variables. In Lecture # 5, we learned that linear combinations of normal random variables are normally distributed. We can use our formulas for linear combinations to calculate the mean and variance of the average:

E[(Y_1 + Y_2)/2 | X_1 = 1, X_2 = 1] = (1/2)(1.5) + (1/2)(1.5) = 1.5

V[(Y_1 + Y_2)/2 | X_1 = 1, X_2 = 1] = (1/4)(9) + (1/4)(9) = 4.5

So the 95% prediction interval for the average is 1.5 ± 2√4.5 ≈ (−2.74, 5.74).

(e) What is Pr(Y < 8 | X = 2)?

Y | X = 2 ~ N(−2 + 3.5(2), 9) = N(5, 9), so Pr(Y < 8 | X = 2) = Pr(Z < (8 − 5)/3) = Pr(Z < 1) = 0.8413.
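The conditional means, prediction interval, and probability in parts (a)-(e) follow directly from the model; this sketch reproduces them with the standard normal cdf from Python's standard library:

```python
from statistics import NormalDist

# Model: Y = alpha + beta * X + eps, eps ~ N(0, 9)
alpha, beta, var = -2.0, 3.5, 9.0
sd = var ** 0.5                    # = 3

mean_at_1 = alpha + beta * 1                    # E(Y | X = 1) = 1.5
pi = (mean_at_1 - 2 * sd, mean_at_1 + 2 * sd)   # (c): (-4.5, 7.5)

# (d) average of two Y's at X = 1: variance is (1/4)(9 + 9) = 4.5
var_avg = 0.25 * (var + var)

# (e) Pr(Y < 8 | X = 2): Y | X = 2 ~ N(5, 9)
p = NormalDist(mu=alpha + beta * 2, sigma=sd).cdf(8)
```

The last line is just Pr(Z < 1) in disguise, since (8 − 5)/3 = 1.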
Question # 6. The simple linear regression model

Suppose we are modeling house price as depending on house size. Price is measured in thousands of dollars and size is measured in thousands of square feet. Suppose our model is:

P = 20 + 50x + ε,    ε ~ N(0, 15²)

(a) What are the units of the intercept α (the 20) and the standard deviation of the errors σ (the 15)?

Thousands of dollars, the same units as P.

(b) What are the units of the slope β (the 50)?

Thousands of dollars per thousand square feet.

(c) Suppose you know a particular house has size x_1 = 1.6. What is the conditional distribution of its price, P_1, given that size? Give a 95% predictive interval for the price of the house.

The conditional distribution is:

P_1 | x_1 = 1.6 ~ N(20 + 50(1.6), 15²) = N(100, 15²)

The 95% predictive interval is:

100 ± 2(15) = 100 ± 30 = (70, 130)

Now suppose another house has size x_2 = 2.2. What is the conditional distribution of its price, P_2, given that size? Give a 95% predictive interval for the price of the house.
The conditional distribution is:

P_2 | x_2 = 2.2 ~ N(20 + 50(2.2), 15²) = N(130, 15²)

The 95% predictive interval is:

130 ± 2(15) = 130 ± 30 = (100, 160)

(d) Let's call the first house in (c) house 1, with size x_1 = 1.6 and price P_1. The second house in (c) has size x_2 = 2.2 and price P_2. What is the distribution of the difference in prices, P_2 − P_1, given that we know the sizes are x_2 = 2.2 and x_1 = 1.6 as above? What assumptions are you making about the errors?

The mean and variance of the price difference are:

E[P_2 − P_1] = E[P_2] − E[P_1] = 130 − 100 = 30
V[P_2 − P_1] = V[P_2] + V[P_1] − 2 Cov(P_1, P_2)

Now, we assume that the errors are independent. This means Cov(P_1, P_2) = 0. Therefore, using the results we just derived, we know that

P_2 − P_1 ~ N(130 − 100, 225 + 225) = N(30, 450)
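Since both houses use the same formula, a small helper makes parts (c) and (d) mechanical (a sketch using only the model stated in the problem):

```python
# Model: P = 20 + 50 x + eps, eps ~ N(0, 15^2); prices in $1000s
alpha, beta, sd = 20, 50, 15

def pred_interval(x):
    """Conditional mean and 95% predictive interval for price given size x."""
    mean = alpha + beta * x
    return mean, (mean - 2 * sd, mean + 2 * sd)

m1, pi1 = pred_interval(1.6)      # house 1: N(100, 15^2), interval (70, 130)
m2, pi2 = pred_interval(2.2)      # house 2: N(130, 15^2), interval (100, 160)

# (d) difference in prices, assuming independent errors (Cov = 0)
diff_mean = m2 - m1               # 30
diff_var = sd ** 2 + sd ** 2      # 450
```

Note how the predictive half-width (± 30) is the same for both houses: in this model the error variance does not depend on size.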
Question # 7. The simple linear regression model

For this question, use the beer data (beer.xls) we talked about earlier in Lecture # 2. We want to know how the number of beers a student claims to be able to drink is related to the student's body weight. Suppose we believe this relationship is

B_i = α + β w_i + ε_i,    ε_i ~ N(0, σ²) i.i.d.

Here B_i is the number of beers the i-th student claims to be able to drink before becoming intoxicated and w_i is the i-th student's body weight in pounds.

(a) Make a scatter plot of nbeer versus weight. Does a linear relationship seem reasonable? Do you think the slope in this linear relationship is positive or negative?

[Scatter plot of beers versus weight.]

A linear relationship looks reasonable. We expect the slope to be positive.

(b) Run this regression in Excel. Suppose we have a student who weighs 150 pounds. What is the 95% plug-in predictive interval for the number of beers this student will claim to be able to drink?

Here is the output from Excel.
[Excel regression output for nbeer on weight: summary measures, ANOVA table, and coefficient estimates with standard errors, t-values, p-values, and confidence limits.]

The 95% plug-in predictive interval is a + b·x ± 2 s_e. Plugging in x = 150 and the estimated coefficients gives (1.39, 12.34).

(c) Based on this data, do we believe there may be a significant relationship between body weight and claims about drinking capacity? Explain.

Yes. If we test the null hypothesis H_0: β = 0, we get a p-value that is zero to at least four decimal places (Excel outputs this automatically). In other words, there is significant evidence that claims about drinking ability are related to body weight.

(d) Someone makes the statement, "For each additional 10 pounds of body weight, a person will generally claim s/he can drink one additional beer before becoming intoxicated." Using a hypothesis test, does the evidence in the data support this claim?

The hypothesis we're interested in here is H_0: β = 1/10. (Since the slope is measured in beers per pound, this is one more beer per ten additional pounds of body weight.)
The test statistic is:

t = (b − β_0)/s_b = (b − 1/10)/s_b

Plugging in the estimates, |t| < 2, so we fail to reject at the 5% level. This means that there is no conclusive evidence against this claim (though we still can't say for sure it is true!).

(e) Insert a new worksheet and copy the first four (nbeer, weight) data points over to the new worksheet. Run the regression again using only the first four observations. Are the confidence intervals for α and β wider or narrower than those we calculated using all the data? Why?

See below.

(f) Using the regression you ran in part (e), test the null hypothesis H_0: β = 0. Does this prove that weight has no effect on claimed drinking capacity?

[Excel regression output for nbeer on weight using only the first four observations.]

Note how much bigger the standard error s_b is using only four data points. This is because with only four data points, we are not able to measure the relationship between nbeer and weight very well.
We now fail to reject H_0: β = 0 at the 5% level. However, it is silly to say we have proven there is no relationship between beers and weight. It's simply that with only four data points we aren't measuring this relationship accurately. This is why we NEVER say we've "accepted" or "proven" a null hypothesis. When we're doing hypothesis testing and we fail to reject, it could be because the null is true, or it could be because the null is false but we don't have enough information in our sample to get an accurate measurement. So we simply say "fail to reject," meaning that based on this data, we do not have conclusive evidence against the claim.
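beer.xls is not included here, so the sketch below runs the same least-squares mechanics on a small made-up (weight, nbeer) sample; the numbers it produces are illustrative only, not the estimates from the course data:

```python
import numpy as np

# Hypothetical sample (weights in pounds, claimed beers); an assumption,
# not the beer.xls data
weight = np.array([130, 150, 170, 190, 110, 160, 200, 140])
nbeer  = np.array([5, 7, 9, 10, 4, 8, 12, 6])

# Least squares: B = a + b * weight
X = np.column_stack([np.ones_like(weight, dtype=float), weight])
b, *_ = np.linalg.lstsq(X, nbeer, rcond=None)   # [intercept, slope]

resid = nbeer - X @ b
n, k = X.shape
s_e = np.sqrt(resid @ resid / (n - k))          # StErr of Est

# Standard error of the slope, and the t statistic for H0: beta = 1/10
s_b = s_e / np.sqrt(((weight - weight.mean()) ** 2).sum())
t = (b[1] - 0.10) / s_b

# 95% plug-in predictive interval at x = 150 pounds
pred = b @ [1.0, 150.0]
interval = (pred - 2 * s_e, pred + 2 * s_e)
```

The structure mirrors the Excel workflow: the coefficient table gives b and s_b for the hypothesis tests in (c)-(d), while s_e ("StErr of Est") drives the plug-in predictive interval in (b).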
Question # 8. The market model

A well-known model in finance assumes that the rate of return on a stock, R_{i,t}, is linearly related to the rate of return on the overall stock market, R_{M,t}. Here, the subscript i denotes the individual company and the subscript t denotes time. In practice, the market return R_{M,t} is taken to be the rate of return on some major stock-market index (e.g. the S&P 500). The regression model is

R_{i,t} = α + β R_{M,t} + ε_{i,t},    ε_{i,t} ~ N(0, σ²)

The slope coefficient β, called the "beta" of the stock, measures the sensitivity of the stock's rate of return to changes in the level of the overall market. For example, if β > 1, then the stock's rate of return is more sensitive to changes in the level of the overall market than the average stock. Conversely, if β < 1, then the stock's rate of return is less sensitive to market changes than the average stock.

An alternative way to view the market model is as a model for the conditional mean of returns:

E(R_{i,t} | R_{M,t}) = α + β R_{M,t}

This says that we expect the average return on a stock to depend on the market as a whole.

The data for this problem is in the file stockreturns.xlsx on the course homepage. This file contains the monthly returns on six different stocks as well as the S&P 500 (GSPC) from February 1980 to March 2014.

(a) Choose any three stocks and run a regression of the market model. What do you find? Which company has the largest β?

Here are the results for GE. This company had the highest beta.
(b) In Excel, a quick way to calculate the beta of a stock is to use the =SLOPE() function. This function reports the slope coefficient in a simple linear regression model. Using this function, check the solutions for the slope coefficients in part (a).

Using =SLOPE(), I estimated the same regression for GE. As you can see, we get the same results.
Over long periods of time (e.g. 20 years), we might be concerned that a company's true value of β (its exposure to market risk) changes over time. There are many reasons this could happen. Over time, companies' business strategies may change, product lines may change, and competition may change. There are a number of ways we could check to see if β is stable over time. The remainder of this problem will show you two ways.

(c) Choose one of the companies that you think may have changed over time. (I chose IBM.) Split the entire sample of data into roughly two parts. For example, I selected 1980 to 1996 and then 1997 to 2014. Run two separate regressions: one on the first half of the sample (1980 to 1996) and another on the second half of the sample (1997 to 2014). What do you find? Do the 95% confidence intervals for the slope coefficients overlap across the two samples?

Here are the results for IBM for the first and second half of the sample. We can see that the confidence regions overlap only slightly. This provides some evidence of parameter instability.
(d) An alternative to splitting the sample into two parts is what is called a "rolling regression." In a rolling regression, we start with an initial portion of our data (say the first T = 100 observations) and we run a regression. Then, we add the next observation to the data set but we throw out the initial observation (the total number of observations remains T = 100), and we run the regression again. A rolling regression repeats this procedure and "rolls" through time from the beginning to the end of the dataset. The idea is to see how stable the least squares slope coefficients are through time.

Here is a graph of the slope coefficients in the rolling regression for IBM. Unfortunately, the =SLOPE() function does not provide confidence intervals.
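stockreturns.xlsx is not included here, so this sketch simulates a stock whose true beta drifts downward and runs the rolling regression described above; the window length, drift, and volatilities are all made-up choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: market returns, plus a stock whose true beta drifts
# from 1.5 down to 0.5 over the sample (an assumption, not the IBM data)
T = 400
mkt = rng.normal(0.005, 0.04, T)
true_beta = np.linspace(1.5, 0.5, T)
stock = 0.001 + true_beta * mkt + rng.normal(0, 0.03, T)

window = 100
betas = []
for start in range(T - window + 1):
    m = mkt[start:start + window]
    s = stock[start:start + window]
    # Least-squares slope over the window (what Excel's =SLOPE() computes)
    slope = np.cov(m, s, ddof=1)[0, 1] / m.var(ddof=1)
    betas.append(slope)
```

Plotting `betas` against time would reproduce the kind of graph shown for IBM: a drifting series of rolling betas signals parameter instability, while a flat series suggests a stable β.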
More informationBusiness Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing
Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing Agenda Introduction to Estimation Point estimation Interval estimation Introduction to Hypothesis Testing Concepts en terminology
More informationReview of Multiple Regression
Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate
More informationPart Possible Score Base 5 5 MC Total 50
Stat 220 Final Exam December 16, 2004 Schafer NAME: ANDREW ID: Read This First: You have three hours to work on the exam. The other questions require you to work out answers to the questions; be sure to
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 31 (MWF) Review of test for independence and starting with linear regression Suhasini Subba
More informationThis gives us an upper and lower bound that capture our population mean.
Confidence Intervals Critical Values Practice Problems 1 Estimation 1.1 Confidence Intervals Definition 1.1 Margin of error. The margin of error of a distribution is the amount of error we predict when
More information1 A Non-technical Introduction to Regression
1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in
More informationBasic Business Statistics, 10/e
Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:
More informationMath 101: Elementary Statistics Tests of Hypothesis
Tests of Hypothesis Department of Mathematics and Computer Science University of the Philippines Baguio November 15, 2018 Basic Concepts of Statistical Hypothesis Testing A statistical hypothesis is an
More informationLecture 4 Scatterplots, Association, and Correlation
Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variable In this lecture: We shall look at two quantitative variables.
More informationPHYSICS 107. Lecture 3 Numbers and Units
Numbers in Physics PHYSICS 107 Lecture 3 Numbers and Units We've seen already that even 2500 years ago Aristotle recognized that lengths and times are magnitudes, meaning that any length or time can be
More informationLecture 4 Scatterplots, Association, and Correlation
Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variables In this lecture: We shall look at two quantitative variables.
More informationSIMPLE REGRESSION ANALYSIS. Business Statistics
SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients
More informationBasic Business Statistics 6 th Edition
Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based
More informationLECTURE 5. Introduction to Econometrics. Hypothesis testing
LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will
More informationChapter 14 Student Lecture Notes 14-1
Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this
More informationMITOCW MITRES18_005S10_DiffEqnsGrowth_300k_512kb-mp4
MITOCW MITRES18_005S10_DiffEqnsGrowth_300k_512kb-mp4 GILBERT STRANG: OK, today is about differential equations. That's where calculus really is applied. And these will be equations that describe growth.
More informationRegression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur
Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Lecture 10 Software Implementation in Simple Linear Regression Model using
More informationdetermine whether or not this relationship is.
Section 9-1 Correlation A correlation is a between two. The data can be represented by ordered pairs (x,y) where x is the (or ) variable and y is the (or ) variable. There are several types of correlations
More information1 Correlation and Inference from Regression
1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is
More informationStatistics and Quantitative Analysis U4320
Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one
More informationMITOCW ocw f99-lec17_300k
MITOCW ocw-18.06-f99-lec17_300k OK, here's the last lecture in the chapter on orthogonality. So we met orthogonal vectors, two vectors, we met orthogonal subspaces, like the row space and null space. Now
More informationStatistics for IT Managers
Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample
More informationCh. 16: Correlation and Regression
Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to
More informationMathematics Level D: Lesson 2 Representations of a Line
Mathematics Level D: Lesson 2 Representations of a Line Targeted Student Outcomes Students graph a line specified by a linear function. Students graph a line specified by an initial value and rate of change
More informationExpectations and Variance
4. Model parameters and their estimates 4.1 Expected Value and Conditional Expected Value 4. The Variance 4.3 Population vs Sample Quantities 4.4 Mean and Variance of a Linear Combination 4.5 The Covariance
More informationStatistical Foundations:
Statistical Foundations: t distributions, t-tests tests Psychology 790 Lecture #12 10/03/2006 Today sclass The t-distribution t ib ti in its full glory. Why we use it for nearly everything. Confidence
More informationProbability and Statistics
Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT
More informationCorrelation & Simple Regression
Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.
More informationAlgebra 1 Fall Semester Final Review Name
It is very important that you review for the Algebra Final. Here are a few pieces of information you want to know. Your Final is worth 20% of your overall grade The final covers concepts from the entire
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationCorrelation Coefficient: the quantity, measures the strength and direction of a linear relationship between 2 variables.
AFM Unit 9 Regression Day 1 notes A mathematical model is an equation that best describes a particular set of paired data. These mathematical models are referred to as models and are used to one variable
More informationA particularly nasty aspect of this is that it is often difficult or impossible to tell if a model fails to satisfy these steps.
ECON 497: Lecture 6 Page 1 of 1 Metropolitan State University ECON 497: Research and Forecasting Lecture Notes 6 Specification: Choosing the Independent Variables Studenmund Chapter 6 Before we start,
More informationappstats27.notebook April 06, 2017
Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves
More informationAddition and Subtraction of real numbers (1.3 & 1.4)
Math 051 lecture notes Professor Jason Samuels Addition and Subtraction of real numbers (1.3 & 1.4) ex) 3 + 5 = ex) 42 + 29 = ex) 12-4 = ex) 7-9 = ex) -3-4 = ex) 6 - (-2) = ex) -5 - (-3) = ex) 7 + (-2)
More informationExample 1: Dear Abby. Stat Camp for the MBA Program
Stat Camp for the MBA Program Daniel Solow Lecture 4 The Normal Distribution and the Central Limit Theorem 187 Example 1: Dear Abby You wrote that a woman is pregnant for 266 days. Who said so? I carried
More informationINTRODUCTION TO ANALYSIS OF VARIANCE
CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two
More informationChapter 26: Comparing Counts (Chi Square)
Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces
More informationExplanation of R 2, and Other Stories
Explanation of R 2, and Other Stories I accidentally gave an incorrect formula for R 2 in class. This summary was initially just going to be me correcting my error, but I've taken the opportunity to clarify
More informationSTA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.
STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory
More informationy n 1 ( x i x )( y y i n 1 i y 2
STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered
More informationMITOCW ocw f99-lec16_300k
MITOCW ocw-18.06-f99-lec16_300k OK. Here's lecture sixteen and if you remember I ended up the last lecture with this formula for what I called a projection matrix. And maybe I could just recap for a minute
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationIn the previous chapter, we learned how to use the method of least-squares
03-Kahane-45364.qxd 11/9/2007 4:40 PM Page 37 3 Model Performance and Evaluation In the previous chapter, we learned how to use the method of least-squares to find a line that best fits a scatter of points.
More informationEcon 325: Introduction to Empirical Economics
Econ 325: Introduction to Empirical Economics Chapter 9 Hypothesis Testing: Single Population Ch. 9-1 9.1 What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population
More informationMidterm 2 - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put
More informationTHE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS
THE ROYAL STATISTICAL SOCIETY 008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS The Society provides these solutions to assist candidates preparing for the examinations
More informationGraphing Equations Chapter Test
1. Which line on the graph has a slope of 2/3? Graphing Equations Chapter Test A. Line A B. Line B C. Line C D. Line D 2. Which equation is represented on the graph? A. y = 4x 6 B. y = -4x 6 C. y = 4x
More information79 Wyner Math Academy I Spring 2016
79 Wyner Math Academy I Spring 2016 CHAPTER NINE: HYPOTHESIS TESTING Review May 11 Test May 17 Research requires an understanding of underlying mathematical distributions as well as of the research methods
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationProbability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur
Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation
More informationRegression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.
Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would
More information