More Statistical Inference

Size: px
Start display at page:

Download "More Statistical Inference"

Transcription

1 4 PART FOUR

2 More Statistical Inference CHAPTER 9 Correlation and Regression CHAPTER 10 Chi-Square Tests and the F-Distribution CHAPTER 11 Nonparametric Tests 493

3 C H A P T E R 9 Correlation and Regression 9.1 Correlation ACTIVITY 9.2 Linear Regression ACTIVITY CASE STUDY 9.3 Measures of Regression and Prediction Intervals 9.4 Multiple Regression USES AND ABUSES REAL STATISTICS REAL DECISIONS TECHNOLOGY In 2006, the total bo office gross was about 9.3 billion dollars with more than 1.4 billion tickets sold. Two of the four distributors who surpassed the $1 billion mark in overall gross were 20th Centur Fo and Warner Brothers.

4 WHERE YOU VE BEEN In Chapters 1 8, ou studied descriptive statistics, probabilit, and inferential statistics. One of the techniques ou learned in descriptive statistics was graphing paired data with a scatter plot (Section 2.2). For instance, the budgets and worldwide grosses of 25 of the most epensive movies distributed b 20th Centur Fo are shown in tabular form at the right and in graphical form below. Worldwide gross (in millions of dollars) th Centur Fo Budget (in millions of dollars) WHERE YOU RE GOING 25 of the Most Epensive 20th Centur Fo Movies Budget (in millions of dollars) Worldwide Gross (in millions of dollars) In this chapter, ou will stud how to describe and test the significance of relationships between two variables when data are presented as ordered pairs. For instance, in the scatter plot for the 20th Centur Fo movies, it appears that higher grosses tend to correspond to higher budgets and lower grosses tend to correspond to lower budgets. This relationship is described b saing that the worldwide grosses are positivel correlated to the movies budgets. Graphicall, the relationship can be described b drawing a line, called a regression line, that fits the points as closel as possible. The second scatter plot below shows 25 of the most epensive movies distributed b Warner Brothers. From the scatter plot, it appears that there is no linear correlation between the worldwide grosses and the movies budgets. 20 th Centur Fo Warner Brothers Worldwide gross (in millions of dollars) Budget (in millions of dollars) Worldwide gross (in millions of dollars) Budget (in millions of dollars) 495

5 496 C H A P T E R 9 CORRELATION AND REGRESSION 9.1 Correlation What You SHOULD LEARN An introduction to linear correlation, independent and dependent variables, and the tpes of correlation How to find a correlation coefficient How to test a population correlation coefficient r using a table How to perform a hpothesis test for a population correlation coefficient r How to distinguish between correlation and causation An Overview of Correlation Correlation Coefficient Using a Table to Test a Population Correlation Coefficient r Hpothesis Testing for a Population Correlation Coefficient r Correlation and Causation An Overview of Correlation Suppose a safet inspector wants to determine whether a relationship eists between the number of hours of training for an emploee and the number of accidents involving that emploee. Or suppose a pschologist wants to know whether a relationship eists between the number of hours a person sleeps each night and that person s reaction time. How would he or she determine if an relationship eists? In this section, ou will stud how to describe what tpe of relationship, or correlation, eists between two quantitative variables and how to determine whether the correlation is significant. DEFINITION A correlation is a relationship between two variables. The data can be represented b the ordered pairs 1, 2 where is the independent (or eplanator) variable and is the dependent (or response) variable. In Section 2.2, ou learned that the graph of ordered pairs 1, 2 is called a scatter plot. In a scatter plot, the ordered pairs 1, 2 are graphed as points in a coordinate plane. The independent (eplanator) variable is measured b the horizontal ais, and the dependent (response) variable is measured b the vertical ais. A scatter plot can be used to determine whether a linear (straight line) correlation eists between two variables. The following scatter plots show several tpes of correlation. As increases, tends to decrease. Negative Linear Correlation As increases, tends to increase. Positive Linear Correlation No Correlation Nonlinear Correlation

6 SECTION 9.1 CORRELATION 497 Advertising epenses (1000s of $), Compan sales (1000s of $), Income level (1000s of $), Donating percent, EXAMPLE 1 Constructing a Scatter Plot A marketing manager conducted a stud to determine whether there is a linear relationship between mone spent on advertising and compan sales. The data are shown in the table at the left. Displa the data in a scatter plot and determine whether there appears to be a positive or negative linear correlation or no linear correlation. Solution The scatter plot is shown at the right. From the scatter plot, it appears that there is a positive linear correlation between the variables. Interpretation Reading from left to right, as the advertising epenses increase, the sales tend to increase. Tr It Yourself 1 Compan sales (in thousands of dollars) A sociologist conducted a stud to determine whether there is a linear relationship between famil income level and percent of income donated to charities. The data are shown in the table at the left. Displa the data in a scatter plot and determine the tpe of correlation. a. Draw and label the - and -aes. b. Plot each ordered pair. c. Does there appear to be a linear correlation? If so, interpret the correlation in the contet of the data. Answer: Page A Advertising epenses (in thousands of dollars) EXAMPLE 2 Constructing a Scatter Plot A student nurse conducts a stud to determine whether there is a linear relationship between an individual s weight (in pounds) and dail water consumption (in ounces). The data are shown in the following table. Organize the data in a scatter plot and describe the tpe of correlation. Weight, Water, Solution The scatter plot is shown at the right. From the scatter plot, it appears that there is no linear correlation between the variables. Interpretation A person s weight does not appear to be related to the amount of water that person consumes. Water consumption (in ounces) Weight (in pounds)

7 498 C H A P T E R 9 CORRELATION AND REGRESSION Stud Tip Save an data put into a technolog tool because these data will be used throughout the chapter. Tr It Yourself 2 A marketing manager conducts a stud to determine whether there is a linear relationship between a person s age and the number of magazines to which that person subscribes. The data are shown in the following table. Displa the data in a scatter plot and determine the tpe of correlation. Age, Subscriptions, a. Draw and label the - and -aes. b. Plot each ordered pair. c. Does there appear to be a linear correlation? If so, interpret the correlation in the contet of the data. Answer: Page A46 Duration, Time, Duration, Time, EXAMPLE 3 Constructing a Scatter Plot Using Technolog Old Faithful, located in Yellowstone National Park, is the world s most famous geser. The duration (in minutes) of several of Old Faithful s eruptions and the times (in minutes) until the net eruption are shown in the table at the left. Using a TI-83/84, displa the data in a scatter plot. Determine the tpe of correlation. Solution Begin b entering the -values into List 1 and the -values into List 2. Use Stat Plot to construct the scatter plot. The plot should look similar to the one shown below. From the scatter plot, it appears that the variables have a positive linear correlation Stud Tip You can also use MINITAB and Ecel to construct scatter plots. Interpretation You can conclude that the longer the duration of the eruption, the longer the time before the net eruption begins. Tr It Yourself 3 Consider the data from the Chapter Opener on page 495 on 25 of the most epensive 20th Centur Fo movies. Use a technolog tool to displa the data in a scatter plot. Determine the tpe of correlation. a. Enter the data into List 1 and List 2. b. Construct the scatter plot. c. Does there appear to be a linear correlation? If so, interpret the correlation in the contet of the data. Answer: Page A46

8 SECTION 9.1 CORRELATION 499 Correlation Coefficient Interpreting correlation using a scatter plot can be subjective. A more precise wa to measure the tpe and strength of a linear correlation between two variables is to calculate the correlation coefficient. Although a formula for the sample correlation coefficient is given, it is more convenient to use a technolog tool to calculate this value. DEFINITION Insight The formal name for r is the Pearson product moment correlation coefficient. It is named after the English statistician Karl Pearson ( ). (See page 35.) The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The smbol r represents the sample correlation coefficient. A formula for r is r = ng - 1g 21g 2 2ng 2-1g 2 2 2ng 2-1g 2 2 where n is the number of pairs of data. The population correlation coefficient is represented b r (the lowercase Greek letter rho, pronounced row ). The range of the correlation coefficient is -1 to 1. If and have a strong positive linear correlation, r is close to 1. If and have a strong negative linear correlation, r is close to -1. If there is no linear correlation or a weak linear correlation, r is close to 0. It is important to remember that if r is close to 0, it does not mean that there is no relation between and, just that there is no linear relation. Several eamples are shown below. Shoe size Height (in inches) Strong positive correlation r = 0.81 Test grade Number of absences Strong negative correlation r = Amount spent on milk per ear (in dollars) Income per ear (in thousands of dollars) Weak positive correlation r = 0.45 Height (in inches) I.Q. score No correlation r = 0.04 Stud Tip g 2 means square each value and add the squares. 1g 2 2 means add the values and square the sum. G GUIDELINES U I D E L I N E S Calculating a Correlation Coefficient In Words 1. Find the sum of the -values. 2. Find the sum of the -values. 3. Multipl each -value b its corresponding -value and find the sum. 4. Square each -value and find the sum. 5. Square each -value and find the sum. 6. Use these five sums to calculate the correlation coefficient. r = In Smbols g g g g 2 g 2 ng - 1g 21g 2 2ng 2-1g2 2 2ng 2-1g2 2

9 500 C H A P T E R 9 CORRELATION AND REGRESSION EXAMPLE 4 Finding the Correlation Coefficient Calculate the correlation coefficient for the advertising ependitures and compan sales data given in Eample 1. What can ou conclude? Solution Use a table to help calculate the correlation coefficient. Advertising epenses (1000s of $), Compan sales (1000s of $), , , , , , , , ,225 g = 15.8 g = 1634 g = g 2 = g 2 = 337,558 Stud Tip Notice that the correlation coefficient r in Eample 4 is rounded to three decimal places. This round-off rule will be used throughout the tet. With these sums and n = 8, the correlation coefficient is r = = = ng - 1g 21g 2 2ng 2-1g 2 2 2ng 2-1g , L ,508 The result r L suggests a strong positive linear correlation. Interpretation As the amount spent on advertising increases, the compan sales also increase. Tr It Yourself 4 Calculate the correlation coefficient for the income level and donating percent data given in Tr It Yourself 1. What can ou conclude? Income level (1000s of $), Donating percent, a. Identif n. b. Use a table to calculate g, g, g, g 2, and g 2. c. Use the resulting sums and n to calculate r. d. What can ou conclude? Answer: Page A46

10 SECTION 9.1 CORRELATION 501 EXAMPLE 5 Using Technolog to Find a Correlation Coefficient Use a technolog tool to calculate the correlation coefficient for the Old Faithful data given in Eample 3. What can ou conclude? Solution MINITAB, Ecel, and the TI-83/84 each have features that allow ou to calculate a correlation coefficient for paired data sets. Tr using this technolog to find r. You should obtain results similar to the following. MINITAB Correlations: C1, C2 Pearson correlation of C1 and C2 = EXCEL A B C CORREL(A1:A25,B1:B25) Before using the TI- 83/84 to calculate r, ou must enter the Diagnostic On command. To do so, enter the following kestrokes: To eplore this topic further, see Activit 9.1 on page nd 0 cursor to DiagnosticOn ENTER ENTER. The following screens describe how to find r using a TI-83/84 with the data stored in List 1 and List 2. To begin, use the STAT kestroke. TI-83/84 EDIT CALC TESTS 1: 1 Var Stats 2: 2 Var Stats 3: Med Med 4: LinReg(a+b) 5: QuadReg 6: CubicReg 7 Quart Reg TI-83/84 LinReg(a+b) L1, L2 TI-83/84 LinReg =a+b a= b= r 2 = r= Correlation coefficient The result, r L 0.979, suggests a strong positive linear correlation. Tr It Yourself 5 Calculate the correlation coefficient for the data from the Chapter Opener on page 495 on 25 of the most epensive 20th Centur Fo movies. What can ou conclude? a. Enter the data. b. Use the appropriate feature to calculate r. c. What can ou conclude? Answer: Page A46

11 502 C H A P T E R 9 CORRELATION AND REGRESSION Stud Tip The level of significance is denoted b a, the lowercase Greek letter alpha. Using a Table to Test a Population Correlation Coefficient R Once ou have calculated r, the sample correlation coefficient, ou will want to determine whether there is enough evidence to decide that the population correlation coefficient r is significant. In other words, based on a few pairs of data, can ou make an inference about the population of all such data pairs? Remember that ou are using sample data to make a decision about population data, so it is alwas possible that our inference ma be wrong. In correlation studies, the small percentage of times when ou decide that the correlation is significant when it is reall not is called the level of significance. It is tpicall set at a = 0.01 or When a = 0.05, ou will probabl decide that the population correlation coefficient is significant when it is reall not 5% of the time. (Of course, 95% of the time, ou will correctl determine that a correlation coefficient is significant.) When a = 0.01, ou will make this tpe of error onl 1% of the time. When using a lower level of significance, however, ou ma fail to identif some significant correlations. In order for a correlation coefficient to be significant, its absolute value must be close to 1. To determine whether the population correlation coefficient r is significant, use the critical values given in Table 11 in Appendi B. A portion of the table is shown below. If ƒrƒ is greater than the critical value, there is enough evidence to decide that the correlation is significant. Otherwise, there is not enough evidence to sa that the correlation is significant. For instance, to determine whether r is significant for five pairs of data 1n = 52 at a level of significance of a = 0.01, ou need to compare ƒrƒ with a critical value of 0.959, as shown in the table. Number n of pairs of data in sample n = 0.05 = Critical values for α = 0.05 and α = 0.01 If ƒrƒ , the correlation is significant. Otherwise, there is not enough evidence to conclude that the correlation is significant. The guidelines for this process are as follows. Stud Tip If ou determine that the linear correlation is significant, then ou will be able to proceed to write the equation for the line that best describes the data. This line, called the regression line, can be used to predict the value of when given a value of. You will learn how to write this equation in the net section. G GUIDELINES U I D E L I N E S Using Table 11 for the Correlation Coefficient r In Words In Smbols 1. Determine the number of pairs Determine n. of data in the sample. 2. Specif the level of significance. Specif a. 3. Find the critical value. Use Table 11 in Appendi B. 4. Decide if the correlation is If ƒrƒ 7 critical value, the significant. correlation is significant. Otherwise, there is not enough evidence to conclude that the correlation is significant. 5. Interpret the decision in the contet of the original claim.

12 SECTION 9.1 CORRELATION 503 EXAMPLE 6 Using Table 11 for a Correlation Coefficient In Eample 5, ou used 25 pairs of data to find r L Is the correlation coefficient significant? Use a = Solution The number of pairs of data is 25, so n = 25. The level of significance is a = Using Table 11, find the critical value in the a = 0.05 column that corresponds to the row with n = 25. The number in that column and row is Insight Notice that the fewer data points in our stud, the stronger the evidence has to be to conclude that the correlation coefficient is significant. n = 0.05 = Because ƒrƒ L , ou can decide that the population correlation is significant. Interpretation At the 5% significance level, there is enough evidence to conclude that there is a significant linear correlation between the duration of Old Faithful s eruptions and the time between eruptions. Tr It Yourself 6 In Tr It Yourself 4, ou calculated the correlation of the income level and donating percent data to be r L Is the correlation coefficient significant? Use a = a. Determine the number of pairs of data in the sample. b. Specif the level of significance. c. Find the critical value. Use Table 11 in Appendi B. d. Compare ƒrƒ with the critical value and decide if the correlation is significant. e. Interpret the decision in the contet of the original claim. Answer: Page A46

13 504 C H A P T E R 9 CORRELATION AND REGRESSION Hpothesis Testing for a Population Correlation Coefficient R You can also use a hpothesis test to determine whether the sample correlation coefficient r provides enough evidence to conclude that the population correlation coefficient r is significant. A hpothesis test for r can be one-tailed or two-tailed. The null and alternative hpotheses for these tests are as follows. e H 0: r Ú 0 H a : r 6 0 e H 0: r 0 H a : r 7 0 (no significant negative correlation) (significant negative correlation) (no significant positive correlation) (significant positive correlation) Left-tailed test Right-tailed test e H 0: r = 0 H a : r Z 0 (no significant correlation) (significant correlation) Two-tailed test In this tet, ou will consider onl two-tailed hpothesis tests for rs. THE t -TEST FOR THE CORRELATION COEFFICIENT A t-test can be used to test whether the correlation between two variables is significant. The test statistic is r and the standardized test statistic t = r r = s r 1 - r 2 B n - 2 follows a t-distribution with n - 2 degrees of freedom. G GUIDELINES U I D E L I N E S Using the t-test for the Correlation Coefficient r In Words In Smbols 1. Identif the null and alternative hpotheses. State and H a. 2. Specif the level of significance. Identif a. 3. Identif the degrees of freedom. 4. Determine the critical value(s) and Use Table 5 in the rejection region(s). Appendi B. 5. Find the standardized test statistic. d.f. = n Make a decision to reject or fail to reject If t is in the rejecthe null hpothesis. tion region, reject H 0. Otherwise, fail to reject H Interpret the decision in the contet of the original claim. t = H 0 r 1 - r 2 B n - 2

14 SECTION 9.1 CORRELATION 505 EXAMPLE 7 The t-test for a Correlation Coefficient In Eample 4, ou used eight pairs of data to find r L Test the significance of this correlation coefficient. Use a = Solution The null and alternative hpotheses are Insight In Eample 7, ou can use Table 11 in Appendi B to test the population correlation coefficient r. Given n = 8 and a = 0.05, the critical value from Table 11 is Because ƒrƒ L , the correlation is significant. Note that this is the same result ou obtained using a t-test for the population correlation coefficient r. H 0 : r = 0 (no correlation) and H a : r Z 0 (significant correlation). Because there are eight pairs of data in the sample, there are 8-2 = 6 degrees of freedom. Because the test is a two-tailed test, a = 0.05, and d.f. = 6, the critical values are -t 0 = and t 0 = The rejection regions are t and t Using the t-test, the standardized test statistic is t = = r 1 - r 2 B n L C 8-2 The following graph shows the location of the rejection regions and the standardized test statistic. 1 α = α = Standardized test statistic t t 0 = t 0 = Stud Tip Be sure ou see in Eample 7 that rejecting the null hpothesis means that there is sufficient evidence that the correlation is significant. Because t is in the rejection region, ou should decide to reject the null hpothesis. Interpretation At the 5% significance level, there is enough evidence to conclude that there is a significant linear correlation between advertising epenses and compan sales. Tr It Yourself 7 In Tr It Yourself 5, ou calculated the correlation coefficient of 25 of the most epensive 20th Centur Fo movies to be r L Test the significance of this correlation coefficient. Use a = a. State the null and alternative hpotheses. b. Specif the level of significance. c. Identif the degrees of freedom. d. Determine the critical values and the rejection regions. e. Find the standardized test statistic. f. Make a decision to reject or fail to reject the null hpothesis. g. Interpret the decision in the contet of the original claim. Answer: Page A46

15 506 C H A P T E R 9 CORRELATION AND REGRESSION The following scatter plot shows the results of a surve conducted as a group project b students in a high school statistics class in the San Francisco area. In the surve, 125 high school students were asked their grade point average (GPA) and the number of caffeine drinks the consumed each da. GPA (5-point scale) Caffeine drinks (cups per da) What tpe of correlation, if an, does the scatter plot show between caffeine consumption and GPA? Correlation and Causation The fact that two variables are strongl correlated does not in itself impl a cause-and-effect relationship between the variables. More in-depth stud is usuall needed to determine whether there is a causal relationship between the variables. If there is a significant correlation between two variables, a researcher should consider the following possibilities. 1. Is there a direct cause-and-effect relationship between the variables? That is, does cause? For instance, consider the relationship between advertising epenses and compan sales that has been discussed throughout this section. It is reasonable to conclude that spending more mone on advertising will result in more sales. 2. Is there a reverse cause-and-effect relationship between the variables? That is, does cause? For instance, consider the Old Faithful data that have been discussed throughout this section. These variables have a positive linear correlation, and it is possible to conclude that the duration of an eruption affects the time before the net eruption. However, it is also possible that the time between eruptions affects the duration of the net eruption. 3. Is it possible that the relationship between the variables can be caused b a third variable or perhaps a combination of several other variables? For instance, consider the budgets and worldwide grosses of 25 of the most epensive movies distributed b 20th Centur Fo listed in the Chapter Opener. Although these variables have a positive linear correlation, it is doubtful that just because the budgets decrease, the worldwide grosses will also decrease. The relationship is probabl due to several other variables, such as advertising, the actors/actresses in the movies, and other movies that open in theaters at the same time. 4. Is it possible that the relationship between two variables ma be a coincidence? For instance, although it ma be possible to find a significant correlation between the number of animal species living in certain regions and the number of people who own more than two cars in those regions, it is highl unlikel that the variables are directl related. The relationship is probabl due to coincidence. Determining which of the above cases is valid for a data set can be difficult. For instance, consider the following eample. Suppose a person breaks out in a rash each time he eats shrimp at a certain restaurant. The natural conclusion is that the person is allergic to shrimp. However, upon further stud b an allergist, it is found that the person is not allergic to shrimp, but to a tpe of seasoning the chef is putting into the shrimp.

16 SECTION 9.1 CORRELATION EXERCISES For Etra Help Building Basic Skills and Vocabular 1. Which value of r indicates a stronger correlation: r = or r = ? Eplain our reasoning. 2. Which of the following values could not represent a correlation coefficient? Eplain wh. (a) r = (b) r = (c) r = (d) r = Eplain how to decide whether a sample correlation coefficient indicates that the population correlation coefficient is significant. 4. Discuss the difference between r and r. Graphical Analsis In Eercises 5 8, the scatter plots of paired data sets are shown. Determine whether there is a positive linear correlation, negative linear correlation, or no linear correlation between the variables Graphical Analsis In Eercises 9 12, the scatter plots show the results of a surve of 20 randoml selected males ages Using age as the eplanator variable, match each scatter plot to the appropriate description. Eplain our reasoning. (a) Age and bod temperature (b) Age and balance on student loans (c) Age and income (d) Age and height In thousands of units Age In units Age

17 508 C H A P T E R 9 CORRELATION AND REGRESSION In thousands of units Age In units Age 34 In Eercises 13 and 14, identif the eplanator variable and the response variable. 13. A nutritionist wants to determine if the amount of water consumed each da b persons of the same weight and on the same diet can be used to predict individual weight loss. 14. An insurance compan hires an actuar to determine whether the number of hours of safet driving classes can be used to predict the number of driving accidents for each driver. Using and Interpreting Concepts Constructing a Scatter Plot and Determining Correlation In Eercises 15 22, (a) displa the data in a scatter plot, (b) calculate the correlation coefficient r, and (c) make a conclusion about the tpe of correlation. 15. Age and Blood Pressure The ages (in ears) of 10 men and their sstolic blood pressures Age, Sstolic blood pressure, Age, Sstolic blood pressure, Age and Vocabular The ages (in ears) of 11 children and the number of words in their vocabular Age, Vocabular size, Age, Vocabular size, Hours Studing and Test Scores The number of hours 13 students spent studing for a test and their scores on that test Hours spent studing, Test score,

18 SECTION 9.1 CORRELATION Hours Online and Test Scores The number of hours 12 students spent online during the weekend and the scores of each student who took a test the following Monda Hours spent online, Test score, Hours spent online, Test score, Movie Budgets and Grosses The budget (in millions of dollars) and worldwide gross (in million of dollars) for eight of the most epensive movies ever made (Adapted from The Numbers) Budget, Gross, Speed of Sound The altitude (in thousands of feet) and speed of sound (in feet per second) Altitude, Speed of sound, Altitude, Speed of sound, Earnings and Dividends The earnings per share and dividends per share for 10 telecommunications services companies in a recent ear (Source: The Value Line Investment Surve) Earnings per share, Dividends per share, Earnings per share, Dividends per share, Crimes and Arrests The number of crimes reported (in millions) and the number of arrests reported (in millions) b the U.S. Department of Justice for 14 ears (Adapted from the National Crime Victimization Surve and Uniform Crime Reports) Crimes, Arrests, Crimes, Arrests,

19 510 C H A P T E R 9 CORRELATION AND REGRESSION Testing Claims In Eercises 23 28, use Table 11 in Appendi B as shown in Eample 6 or perform a hpothesis test using Table 5 in Appendi B as shown in Eample 7 to make a conclusion about the indicated correlation coefficient. If convenient, use technolog to solve the problem. 23. Braking Distances: Dr Surface The weights (in pounds) of eight vehicles and the variabilit of their braking distances (in feet) when stopping on a dr surface are shown in the table. Can ou conclude that there is a significant linear correlation between vehicle weight and variabilit in braking distance on a dr surface? Use a = (Adapted from National Highwa Traffic Safet Administration) Weight, Variabilit in braking distance, Braking Distances: Wet Surface The weights (in pounds) of eight vehicles and the variabilit of their braking distances (in feet) when stopping on a wet surface are shown in the table. At a = 0.05, can ou conclude that there is a significant linear correlation between vehicle weight and variabilit in braking distance on a wet surface? (Adapted from National Highwa Traffic Safet Administration) Weight, Variabilit in braking distance, Hours Studing and Test Scores The number of hours 13 students spent studing for a test and their scores on that test are shown in the table. Is there enough evidence to conclude that there is a significant linear correlation between the data? Use a = (Use the value of r found in Eercise 17.) Hours spent studing, Test score, Hours Online and Test Scores An instructor wants to show students that there is a linear correlation between the number of hours the spend online during a weekend and their scores on a test taken the following Monda. The number of hours spent online and the test scores for 12 randoml selected students are shown in the table. At a = 0.05, is there enough evidence for the instructor to conclude that there is a significant linear correlation between the data? (Use the value of r found in Eercise 18.) Hours spent online, Test score,

20 SECTION 9.1 CORRELATION Earnings and Dividends The following table shows the earnings per share and dividends per share for 10 telecommunications services companies in a recent ear. At a = 0.01, can ou conclude that there is a significant linear correlation between earnings per share and dividends per share? (Use the value of r found in Eercise 21.) (Source: The Value Line Investment Surve) Earnings per share, Dividends per share, Earnings per share, Dividends per share, Crimes and Arrests The following table shows the number of crimes reported (in millions) and the number of arrests reported (in millions) b the U.S. Department of Justice for 14 ears. At a = 0.05, can ou conclude that there is a significant linear correlation between the number of crimes and the number of arrests? (Use the value of r found in Eercise 22.) (Adapted from the National Crime Victimization Surve and Uniform Crime Reports) Crimes, Arrests, Crimes, Arrests, Etending Concepts Interchanging and In Eercises 29 and 30, calculate the correlation coefficient r, letting Row 1 represent the -values and Row 2 the -values. Then calculate the correlation coefficient r, letting Row 2 represent the -values and Row 1 the -values. What effect does switching the eplanator and response variables have on the correlation coefficient? 29. Row Row Row Row Writing Use our school s librar, the Internet, or some other reference source to find a real-life data set with the indicated cause-and-effect relationship. Write a paragraph describing each variable and eplain wh ou think the variables have the indicated cause-and-effect relationship. (a) Direct Cause-and-Effect: Changes in one variable cause changes in the other variable. (b) Other Factors: The relationship between the variables is caused b a third variable. (c) Coincidence: The relationship between the variables is a coincidence.

21 Correlation b Ee 9.1 The correlation b ee applet allows ou to guess the sample correlation coefficient, r, for a data set. When the applet loads, a data set consisting of 20 points is displaed. Points can be added to the plot b clicking the mouse. Points on the plot can be removed b clicking on the point and then dragging the point into the trash can. All of the points on the plot can be removed b simpl clicking inside the trash can. You can enter our guess for r in the Guess field, and then click SHOW R! to see if ou are within 0.1 of the true value. When ou click NEW DATA, a new data set is generated Trash Guess: True r: New data Show r! Eplore Step 1 Add five points to the plot. Step 2 Enter a guess for r. Step 3 Click SHOW R! Step 4 Click NEW DATA. Step 5 Remove five points from the plot. Step 6 Enter a guess for r. Step 7 Click SHOW R! Draw Conclusions 1. Generate a new data set. Using our knowledge of correlation, tr to guess the value of r for the data set. Repeat this 10 times. How man times were ou correct? Describe how ou chose each r value. 2. Describe how to create a data set with a value of r that is approimatel Describe how to create a data set with a value of r that is approimatel Tr to create a data set with a value of r that is approimatel Then tr to create a data set with a value of r that is approimatel 0.9. What did ou do differentl to create the two data sets? 512

22 SECTION 9.2 LINEAR REGRESSION Linear Regression What You SHOULD LEARN How to find the equation of a regression line How to predict -values using a regression equation Regression Lines Applications of Regression Lines Regression Lines After verifing that the linear correlation between two variables is significant, the net step is to determine the equation of the line that best models the data. This line is called a regression line, and its equation can be used to predict the value of for a given value of. Although man lines can be drawn through a set of points, a regression line is determined b specific criteria. Consider the scatter plot and the line shown below. For each data point, d i represents the difference between the observed -value and the predicted -value for a given -value on the line. These differences are called residuals and can be positive, negative, or zero. When the point is above the line, d i is positive. When the point is below the line, d i is negative. If the observed -value equals the predicted -value, d i = 0. Of all possible lines that can be drawn through a set of points, the regression line is the line for which the sum of the squares of all the residuals g d 2 i is a minimum. Observed -value d 4 d i d 1 d 2 d 3 Predicted -value d 5 For a given -value, d = (observed -value) (predicted -value) DEFINITION A regression line, also called a line of best fit, is the line for which the sum of the squares of the residuals is a minimum. Stud Tip When determining the equation of a regression line, it is helpful to construct a scatter plot of the data to check for outliers, which can greatl influence a regression line. You should also check for gaps and clusters in the data. In algebra, ou learned that ou can write an equation of a line b finding its slope m and -intercept b. The equation has the form = m + b. Recall that the slope of a line is the ratio of its rise over its run and the -intercept is the -value of the point at which the line crosses the -ais. It is the -value when = 0. In algebra, ou used two points to determine the equation of a line. In statistics, ou will use ever point in the data set to determine the equation of the regression line = m = = 2 1 b = 2(0) + 1 =

23 514 C H A P T E R 9 CORRELATION AND REGRESSION The equation of a regression line allows ou to use the independent (eplanator) variable to make predictions for the dependent (response) variable. Stud Tip Notice that both the slope m and the -intercept b in Eample 1 are rounded to three decimal places. This round-off rule will be used throughout the tet. THE EQUATION OF A REGRESSION LINE The equation of a regression line for an independent variable and a dependent variable is n = m + b where n is the predicted -value for a given -value. The slope m and -intercept b are given b m = ng - 1g 21g 2 ng 2-1g 2 2 and b = - m = g n - m g n where is the mean of the -values in the data set and is the mean of the -values. The regression line alwas passes through the point 1, 2. Advertising epenses (1000s of $), Compan sales (1000s of $), Income level (1000s of $), Donating percent, EXAMPLE 1 Finding the Equation of a Regression Line Find the equation of the regression line for the advertising ependitures and compan sales data used in Section 9.1. Solution In Eample 4 of Section 9.1, ou found that n = 8, g = 15.8, g = 1634, g = , and g 2 = You can use these values to calculate the slope and -intercept of the regression line as shown. m = ng - 1g 21g2 ng 2-1g 2 2 = b = - m = So, the equation of the regression line is n = To sketch the regression line, use an two -values within the range of data and calculate their corresponding -values from the regression line. Then draw a line through the two points. The regression line and scatter plot of the data are shown at the right. If ou plot the point 1, 2 = , , ou will notice that the line passes through this point. Tr It Yourself = L = L Compan sales (in thousands of dollars) Advertising epenses (in thousands of dollars) Find the equation of the regression line for the income level and donating percent data used in Section 9.1. a. Identif n, g, g, g, and g 2 from Tr It Yourself 4 of Section 9.1. b. Calculate the slope m. c. Calculate the -intercept b. d. Write the equation of the regression line. Answer: Page A

24 SECTION 9.2 LINEAR REGRESSION 515 Duration, Time, Duration, Time, EXAMPLE 2 Using Technolog to Find a Regression Equation Use a technolog tool to find the equation of the regression line for the Old Faithful data used in Section 9.1. Solution MINITAB, Ecel, and the TI-83/84 each have features that automaticall calculate a regression equation. Tr using this technolog to find the regression equation. You should obtain results similar to the following. MINITAB Regression Analsis: C2 versus C1 The regression equation is C2 = C1 Predictor Coef SE Coef T P Constant C S = R-Sq = 95.8% R-Sq(adj) = 95.6% To eplore this topic further, see Activit 9.2 on page EXCEL A B C D Slope: INDEX(LINEST(known_'s,known_'s),1) Y-intercept: INDEX(LINEST(known_'s,known_'s),2) TI-83/84 LinReg =a+b a= b= r 2 = r= From the displas, ou can see that the regression equation is n = The TI-83/84 displa at the right shows the regression line and a scatter plot of the data in the same viewing window. To do this, use Stat Plot to construct the scatter plot and enter the regression equation as Tr It Yourself 2 Use a technolog tool to find the equation of the regression line for the budgets and worldwide grosses of 25 of the most epensive 20th Centur Fo movies given in the Chapter Opener on page 495. a. Enter the data. b. Perform the necessar steps to calculate the slope and -intercept. c. Specif the regression equation. Answer: Page A46

25 516 C H A P T E R 9 CORRELATION AND REGRESSION Stud Tip If the correlation between and is not significant, the best predicted -value is, the mean of the -values in the data set. The following scatter plot shows the relationship between the number of motor vehicles in a state and the number of motor vehicle-caused deaths. (Adapted from U.S. Federal Highwa Administration and U.S. National Highwa Traffic Safet Administration) Motor vehicle deaths (in thousands) r FL TX CA Motor vehicles (in millions) Describe the correlation between these two variables in words. Use the scatter plot to predict the number of motor vehicle-caused deaths in a state that has 16 million motor vehicles. The regression line for this scatter plot is n Use this equation to make a prediction. (Assume and have a significant linear correlation.) How does our algebraic prediction compare with our graphical one? Applications of Regression Lines After finding the equation of a regression line, ou can use the equation to predict -values over the range of the data if the correlation between and is significant. For eample, an advertising eecutive could forecast compan sales on the basis of advertising ependitures. To predict -values, substitute the given -value into the regression equation, then calculate n, the predicted -value. EXAMPLE 3 Predicting -Values Using Regression Equations The regression equation for the advertising epenses (in thousands of dollars) and compan sales (in thousands of dollars) data is n = Use this equation to predict the epected compan sales for the following advertising epenses. (Recall from Section 9.1, Eample 7 that and have a significant linear correlation.) thousand dollars thousand dollars thousand dollars Solution To predict the epected compan sales, substitute each advertising ependiture for in the regression equation. Then calculate n. 1. n = Interpretation When the advertising = L epenses are $1500, the compan sales are about $180, n = Interpretation When the advertising = L epenses are $1800, the compan sales are about $195, n = Interpretation When the advertising = L epenses are $2500, the compan sales are about $230,884. Prediction values are meaningful onl for -values in (or close to) the range of the data. The -values in the original data set range from 1.4 to 2.6. So, it would not be appropriate to use the regression line n = to predict compan sales for advertising ependitures such as 0.5 ($500) or 5.0 ($5000). Tr It Yourself 3 The regression equation for the Old Faithful data is n = Use this to predict the time until the net eruption for each of the following eruption durations. (Recall from Section 9.1, Eample 6 that and have a significant linear correlation.) 1. 2 minutes minutes a. Substitute each value for into the regression equation. b. Calculate n. c. Specif the predicted time until the net eruption for each eruption duration. Answer: Page A46

26 SECTION 9.2 LINEAR REGRESSION EXERCISES For Etra Help Building Basic Skills and Vocabular In Eercises 1 8, match the description in the left column with a description in the right column. 1. Regression line 2. Residual 3. The -value of a data point a. The difference between the observed -value of a data point and the predicted -value on the line for the same data point corresponding to i b. n i 4. The -value for a point on the c. The line of best fit regression line corresponding to i 5. Slope 6. -intercept 7. The mean of the -values 8. The point a regression line alwas passes through d. i e. b f. 1, 2 g. m h. Graphical Analsis In Eercises 9 12, match the regression equation with the appropriate graph. (Note that the - and -aes are broken.) 9. n = n = n = n = a. b. Energ-efficienc rating Cooling capacit (in BTUs) Sstolic blood pressure Age (in ears) c. d. Fat (in grams) Protein (in grams) Leisure time (in hours per week) Work time (in hours per week)

27 518 C H A P T E R 9 CORRELATION AND REGRESSION Using and Interpreting Concepts Finding the Equation of a Regression Line In Eercises 13 22, find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. (Each pair of variables has a significant correlation.) Then use the regression equation to predict the value of for each of the given -values, if meaningful. If the -value is not meaningful to predict the value of, eplain wh not. If convenient, use technolog to solve the problem. 13. Miami Building Heights The heights (in feet) and the number of stories of nine notable buildings in Miami (Source: Emporis Buildings) Height, Stories, (a) = 500 feet (b) = 650 feet (c) = 310 feet (d) = 725 feet 14. Age and Vocabular The ages (in ears) of seven children and the number of words in their vocabular Age, Vocabular size, (a) = 2 ears (b) = 3 ears (c) = 6 ears (d) = 12 ears 15. Hours Studing and Test Scores The number of hours 13 students spent studing for a test and their scores on that test Hours spent studing, Test score, Hours spent studing, Test score, (a) = 3 hours (b) = 6.5 hours (c) = 13 hours (d) = 4.5 hours 16. Hours Online The number of hours 12 students spent online during the weekend and the scores of each student who took a test the following Monda Hours spent online, Test score, Hours spent online, Test score, (a) = 4 hours (b) = 8 hours (c) = 9 hours (d) = 15 hours

28 SECTION 9.2 LINEAR REGRESSION Hot Dogs: Caloric and Sodium Content The caloric content and the sodium content (in milligrams) for 10 beef hot dogs (Source: Consumer Reports) Calories, Sodium, Calories, Sodium, (a) = 170 calories (b) = 100 calories (c) = 140 calories (d) = 210 calories 18. Trees: Heights and Diameters The height (in feet) and trunk diameters (in inches) of 11 trees Height, Trunk diameter, Height, Trunk diameter, (a) = 74 feet (b) = 81 feet (c) = 95 feet (d) = 79 feet 19. Shoe Size and Height The shoe sizes and heights (in inches) for 14 men Shoe size, Height, Shoe size, Height, (a) = size 11.5 (b) = size 8.0 (c) = size 15.5 (d) = size Age and Hours Slept The age (in ears) and the number of hours slept in a da b 10 infants Age, Hours slept, Age, Hours slept, (a) = 0.3 ear (b) = 3.9 ears (c) = 0.6 ear (d) = 0.8 ear

29 520 C H A P T E R 9 CORRELATION AND REGRESSION Weight, Variabilit in braking distance, TABLE FOR EXERCISE Braking Distances: Dr Surface The weights (in pounds) of eight vehicles and the variabilit of their braking distances (in feet) when stopping on a dr surface (Adapted from National Highwa Traffic Safet Administration) (a) = 4500 pounds (b) = 6000 pounds (c) = 7500 pounds (d) = 5750 pounds 22. Braking Distances: Wet Surface The weights (in pounds) of eight vehicles and the variabilit of their braking distances (in feet) when stopping on a wet surface (Adapted from National Highwa Traffic Safet Administration) Weight, Variabilit in braking distance, (a) = 4800 pounds (b) = 5850 pounds (c) = 4075 pounds (d) = 3000 pounds 23. Writing Eplain how to predict -values using the equation of a regression line. 24. Writing Given a set of data and a corresponding regression line, describe all values of that provide meaningful predictions for. Age (in ears), Annual Salar (in thousands), Electrical Engineer Salar In Eercises 25 29, use the following information. You work for a salar analst and gather the data shown in the table. The table shows the ages of 13 electrical engineers and their annual salaries. (Adapted from Pascale, Inc.) 25. Correlation Using the scatter plot of the electrical engineer salar data shown, what tpe of correlation, if an, do ou think the data have? Eplain. 26. Regression Line Find an equation of the regression line for the data. Sketch a scatter plot of the data and draw the regression line. 27. Using the Regression Line The analst used the regression line ou found in Eercise 26 to predict the annual salar for an electrical engineer that is = 74 ears old. Is this a valid prediction? Eplain our reasoning. Age (in ears) 28. Significant Correlation? The analst claims that the population has a significant correlation for a = Verif this claim. 29. Cause and Effect Write a paragraph describing the cause-and-effect relationship between the age and the annual salar of electrical engineers. Average salar (in thousands) Electrical Engineer Etending Concepts Interchanging and In Eercises 30 and 31, do the following. (a) Find the equation of the regression line for the given data, letting Row 1 represent the -values and Row 2 the -values. Sketch a scatter plot of the data and draw the regression line.

30 SECTION 9.2 LINEAR REGRESSION 521 (b) Find the equation of the regression line for the given data, letting Row 2 represent the -values and Row 1 the -values. Sketch a scatter plot of the data and draw the regression line. (c) What effect does switching the eplanator and response variables have on the regression line? 30. Row Row Row Row Residual Plots A residual plot allows ou to assess correlation data and check for possible problems with a regression model. To construct a residual plot, make a scatter plot of (, - n ), where - n is the residual of each -value. If the resulting plot shows an tpe of pattern, the regression line is not a good representation of the relationship between the two variables. If it does not show a pattern, that is, the residuals fluctuate about 0, then the regression line is a good representation. Be aware that if a point on the residual plot appears to be outside the pattern of the other points, then it ma be an outlier. In Eercises 32 and 33, (a) find the equation of the regression line, (b) construct a scatter plot of the data and draw the regression line, (c) construct a residual plot, and (d) determine if there are an patterns in the residual plot and eplain what the suggest about the relationship between the variables Influential Points An influential point is a point in a data set that can greatl affect the graph of a regression line. An outlier ma or ma not be an influential point. To determine if a point is influential, find two regression lines: one including all the points in the data set, and the other ecluding the possible influential point. If the slope or -intercept of the regression line shows significant changes, the point can be considered influential. An influential point can be removed from a data set onl if there is proper justification. In Eercises 34 and 35, (a) construct a scatter plot of the data, (b) identif an possible outliers, and (c) determine if the point is influential. Eplain our reasoning

31 522 C H A P T E R 9 CORRELATION AND REGRESSION 36. Chapter Opener Consider the data from the Chapter Opener on page 495 on 25 of the most epensive movies distributed b 20th Centur Fo. Is the data point (200, ) an outlier? If so, is it influential? Eplain. Transformations to Achieve Linearit When a linear model is not appropriate to represent data, other models can be used. In some cases, the values of or must be transformed to find an appropriate model. In a logarithmic transformation, the logarithms of the variables are used instead of using the original variables when creating a scatter plot and calculating the regression line. Number of hours, Number of bacteria, In Eercises 37 40, use the data shown in the table that shows the number of bacteria present after a certain number of hours. 37. Find the equation of the regression line for the data. Then construct a scatter plot of 1, 2 and sketch the regression line with it. 38. Replace each -value in the table with its logarithm, log. Find the equation of the regression line for the transformed data. Then construct a scatter plot of 1, log 2 and sketch the regression line with it. What do ou notice? 39. An eponential equation is a nonlinear regression equation of the form = ab. Use a technolog tool to find the eponential equation for the original data. Note that ou can also find this model b solving the equation log = m + b from Eercise 38 for. 40. Compare our results in Eercise 39 with the equation of the regression line and its graph in Eercise 37. Which equation is a better model for the data? Eplain. In Eercises 41 44, use the data shown in the table. 41. Find the equation of the regression line for the data. Then construct a scatter plot of 1, 2 and sketch the regression line with it. 42. Replace each -value and -value in the table with its logarithm. Find the equation of the regression line for the transformed data. Then construct a scatter plot of 1log, log 2 and sketch the regression line with it. What do ou notice? 43. A power equation is a nonlinear regression equation of the form = a b. Use a technolog tool to find the power equation for the original data. Note that ou can also find this model b solving the equation log = m1log 2 + b from Eercise 42 for. 44. Compare our results in Eercise 43 with the equation of the regression line and its graph in Eercise 41. Which equation is a better model for the data? Eplain. Logarithmic Equation In Eercises 45 48, use the following information and a technolog tool. The logarithmic equation is a nonlinear regression equation of the form = a + b ln. 45. Find and graph the logarithmic equation for the data given in Eercise Find and graph the logarithmic equation for the data given in Eercise Compare our results in Eercise 45 with the equation of the regression line and its graph. Which equation is a better model for the data? Eplain. 48. Compare our results in Eercise 46 with the equation of the regression line and its graph. Which equation is a better model for the data? Eplain.

32 Regression b Ee The regression b ee applet allows ou to interactivel estimate the regression line for a data set. When the applet loads, a data set consisting of 20 points is displaed. Points on the plot can be added to the plot b clicking the mouse. Points on the plot can be removed b clicking on the point and then dragging the point into the trash can. All of the points on the plot can be removed b simpl clicking inside the trash can. You can move the green line on the plot b clicking and dragging the endpoints.you should tr to move the line in order to minimize the sum of the squares of the residuals, also known as the sum of square error (SSE). Note that the regression line minimizes SSE.The SSE for the green line and for the regression line are given below the plot. The equations of each line are given above the plot. Click SHOW REGRESSION LINE! to see the regression line in the plot. Click NEW DATA to generate a new data set Green line: = Regression line: = Eplore Step 1 Step 2 Move the endpoints of the green line to tr to approimate the regression line. Click SHOW REGRESSION LINE! New data Green SSE: Regression SSE: Show regression line! Trash Draw Conclusions 1. Click NEW DATA to generate a new data set. Tr to move the green line to where the regression line should be. Then click SHOW REGRESSION LINE!. Repeat this five times. Describe how ou moved each green line. 2. On a blank plot, place 10 points so that the have a strong positive correlation. Record the equation of the regression line. Then, add a point in the upper left corner of the plot and record the equation of the regression line. How does the regression line change? 3. Remove the point from the upper-left corner of the plot. Add 10 more points so that there is still a strong positive correlation. Record the equation of the regression line. Add a point in the upper-left corner of the plot and record the equation of the regression line. How does the regression line change? 4. Use the results of Eercises 2 and 3 to describe what happens to the slope of the regression line when an outlier is added as the sample size increases. 523

33 Hip and Abdomen Circumferences Correlation of Bod Measurements In a stud published in Medicine and Science in Sports and Eercise (volume 17, no. 2, page 189) the measurements of 252 men (ages 22 81) are given. Of the 14 measurements taken of each man, some have significant correlations and others don t. For instance, the scatter plot at the right shows that the hip and abdomen circumferences of the men have a strong linear correlation 1r = The partial table shown here lists onl the first nine rows of the data. Abdomen circumference (in centimeters) Hip circumference (in centimeters) Age (r) Weight (lb) Height (in.) Neck (cm) Chest (cm) Abdom. (cm) Hip (cm) Thigh (cm) Knee (cm) Ankle (cm) Bicep (cm) Forearm (cm) Wrist (cm) Bod fat % Eercises 1. Using our intuition, classif the following 1, 2 pairs as having a weak correlation 10 6 r , a moderate correlation r , or a strong correlation r (a) (weight, neck) (b) (weight, height) (c) (age, bod fat) (d) (chest, hip) (e) (age, wrist) (f) (ankle, wrist) (g) (forearm, height) (h) (bicep, forearm) (i) (weight, bod fat) ( j) (knee, thigh) (k) (hip, abdomen) ( l) (abdomen, hip) 2. Now, use a technolog tool to find the correlation coefficient for each pair in Eercise 1. Compare our results with those obtained b intuition. 3. Use a technolog tool to find the regression line for each pair in Eercise 1 that has a strong correlation. 4. Use the results of Eercise 3 to predict the following. (a) The neck circumference of a man whose weight is 180 pounds (b) The abdomen circumference of a man whose hip circumference is 100 centimeters 5. Are there pairs of measurements that have stronger correlation coefficients than 0.85? Use a technolog tool and intuition to reach a conclusion. 524

34 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS Measures of Regression and Prediction Intervals What You SHOULD LEARN How to interpret the three tpes of variation about a regression line How to find and interpret the coefficient of determination How to find and interpret the standard error of estimate for a regression line How to construct and interpret a prediction interval for Variation about a Regression Line The Coefficient of Determination The Standard Error of Estimate Prediction Intervals Variation about a Regression Line In this section, ou will stud two measures used in correlation and regression studies the coefficient of determination and the standard error of estimate. You will also learn how to construct a prediction interval for using a regression line and a given value of. Before studing these concepts, ou need to understand the three tpes of variation about a regression line. To find the total variation, the eplained variation, and the uneplained variation about a regression line, ou must first calculate the total deviation, the eplained deviation, and the uneplained deviation for each ordered pair 1 i, i 2 in a data set. These deviations are shown in the graph. Total deviation = i - Eplained deviation = n i - Uneplained deviation = i - n i After calculating the deviations for each data point 1 i, i 2, ou can find the total variation, the eplained variation, and the uneplained variation. ( i, i ) Total deviation i ( i, i ) ( i, ) Uneplained deviation i i Eplained deviation i Stud Tip Consider the advertising and sales data used throughout this chapter with a regression line of n = Using the data point 12.0, 2202, ou can find the total, eplained, and uneplained deviations as follows. Total deviation: i - = = Eplained deviation: n i - = = Uneplained deviation: i - n i = = DEFINITION The total variation about a regression line is the sum of the squares of the differences between the -value of each ordered pair and the mean of. Total variation = g 1 i The eplained variation is the sum of the squares of the differences between each predicted -value and the mean of. Eplained variation = g 1n i The uneplained variation is the sum of the squares of the differences between the -value of each ordered pair and each corresponding predicted -value. Uneplained variation = g 1 i - n i 2 2 The sum of the eplained and uneplained variations is equal to the total variation. Total variation = Eplained variation + Uneplained variation As its name implies, the eplained variation can be eplained b the relationship between and. The uneplained variation cannot be eplained b the relationship between and and is due to chance or other variables.

35 526 C H A P T E R 9 CORRELATION AND REGRESSION The Coefficient of Determination You alread know how to calculate the correlation coefficient r. The square of this coefficient is called the coefficient of determination. It can be shown that the coefficient of determination is equal to the ratio of the eplained variation to the total variation. Janette Benson (Pscholog Department, Universit of Denver) performed a stud relating the age that infants crawl (in weeks after birth) with the average monthl temperature si months after birth. Her results are based on a sample of 414 infants. Janette Benson believes that the reason for the correlation in temperature to crawling age is that parents tend to bundle infants in more restrictive clothing and blankets during cold months. This bundling doesn t allow the infant as much opportunit to move and eperiment with crawling. Crawling age (in weeks) Temperature (in F) The correlation coefficient is r What percent of the variation in the data can be eplained? What percent is due to chance, sampling error, or other factors? DEFINITION The coefficient of determination the total variation. That is, r 2 = It is important that ou interpret the coefficient of determination correctl. For instance, if the correlation coefficient is r = 0.90, then the coefficient of determination is r 2 = = Eplained variation. Total variation r 2 is the ratio of the eplained variation to This means that 81% of the variation of can be eplained b the relationship between and. The remaining 19% of the variation is uneplained and is due to other factors or to sampling error. EXAMPLE 1 Finding the Coefficient of Determination The correlation coefficient for the advertising epenses and compan sales data as calculated in Eample 4 of Section 9.1 is r L Find the coefficient of determination. What does this tell ou about the eplained variation of the data about the regression line? About the uneplained variation? Solution r 2 = L The coefficient of determination is Interpretation About 83.4% of the variation in the compan sales can be eplained b the variation in the advertising ependitures. About 16.6% of the variation is uneplained and is due to chance or other variables. Tr It Yourself 1 The correlation coefficient for the Old Faithful data as calculated in Eample 5 of Section 9.1 is r L Find the coefficient of determination. What does this tell ou about the eplained variation of the data about the regression line? About the uneplained variation? a. Identif the correlation coefficient r. b. Calculate the coefficient of determination r 2. c. What percent of the variation in the times is eplained? What percent is uneplained? Answer: Page A46

36 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS 527 The Standard Error of Estimate When a n -value is predicted from an -value, the prediction is a point estimate. You can construct an interval estimate for n, but first ou need to calculate the standard error of estimate. DEFINITION The standard error of estimate s e is the standard deviation of the observed -values about the predicted n -value for a given i -value. It is given b i s e = A g1 i - n i 2 2 n - 2 where n is the number of ordered pairs in the data set. From this formula, ou can see that the standard error of estimate is the square root of the uneplained variation divided b n - 2. So, the closer the observed -values are to the predicted -values, the smaller the standard error of estimate will be. G GUIDELINES U I D E L I N E S Finding the Standard Error of Estimate s e Shoe size, Men s heights (in inches), In Words In Smbols 1. Make a table that includes the column i, i, n i, 1 i - n i 2, headings shown at the right. 1 i - n i Use the regression equation to calculate n i = m i + b the predicted -values. 3. Calculate the sum of the squares of the g 1 i - n i 2 2 differences between each observed -value and the corresponding predicted -value. g1 i - n i Find the standard error of estimate. s e = A You can also find the standard error of estimate using the following formula. s e = A g 2 - bg - mg n - 2 This formula is eas to use if ou have alread calculated the slope m, the -intercept b, and several of the sums. For instance, the regression line for the data set given at the left is n = , and the values of the sums are g 2 = 80,877.5, g = 1137, and g = 11, When the alternative formula is used, the standard error of estimate is s e = A g 2 - bg - mg n - 2 = B 80, , n - 2 L

37 528 C H A P T E R 9 CORRELATION AND REGRESSION EXAMPLE 2 Finding the Standard Error of Estimate The regression equation for the advertising epenses and compan sales data as calculated in Eample 1 of Section 9.2 is n = Find the standard error of estimate. Solution Use a table to calculate the sum of the squared differences of each observed -value and the corresponding predicted -value. Uneplained variation When n = 8 and g 1 i - n i 2 2 = are used, the standard error of estimate is s e = A g 1 i - n i 2 2 L n - 2 = B i i n i 1 i n i g = Interpretation The standard error of estimate of the compan sales for a specific advertising epense is about $10,290. Radio ad time Weekl sales Tr It Yourself 2 A researcher collects the data shown at the left and concludes that there is a significant relationship between the amount of radio advertising time (in minutes per week) and the weekl sales of a product (in hundreds of dollars). Find the standard error of estimate. Use the regression equation n = a. Use a table to calculate the sum of the squared differences of each observed -value and the corresponding predicted -value. b. Identif the number n of ordered pairs in the data set. c. Calculate s e. d. Interpret the results. Answer: Page A46

38 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS 529 Prediction Intervals 2 = m + b n Two variables have a bivariate normal distribution if for an fied values of, the corresponding values of are normall distributed and for an fied values of, the corresponding values of are normall distributed. Because regression equations are determined using sample data and because and are assumed to have a bivariate normal distribution, ou can construct a prediction interval for the true value of. To construct the prediction interval, use a t-distribution with n - 2 degrees of freedom n Bivariate Normal Distribution DEFINITION Given a linear regression equation n = m + b and 0, a specific value of,a c-prediction interval for is where n - E 6 6 n + E E = t c s e A n + n ng 2-1g 2 2. The point estimate is n and the margin of error is E.The probabilit that the prediction interval contains is c. G GUIDELINES U I D E L I N E S Construct a Prediction Interval for for a Specific Value of Stud Tip The formulas for s e and E use the quantities g1 1g 2 2 i - n i 2 2,, and g 2. Use a table to calculate these quantities. In Words 1. Identif the number of ordered pairs in the data set n and the degrees of freedom. 2. Use the regression equation and the given -value to find the point estimate n. In Smbols d.f. = n Find the critical value t c that Use Table 5 in Appendi B. corresponds to the given level of confidence c. 4. Find the standard error of estimate s e. 5. Find the margin of error E. n i = m i + b s e = A g 1 i - n i 2 2 n - 2 E = t c s e A n + n ng 2-1g Find the left and right Left endpoint: n - E endpoints and form the Right endpoint: n + E prediction interval. Interval: n - E 6 6 n + E

39 530 C H A P T E R 9 CORRELATION AND REGRESSION EXAMPLE 3 Constructing a Prediction Interval Using the results of Eample 2, construct a 95% prediction interval for the compan sales when the advertising epenses are $2100. What can ou conclude? Solution Because n = 8, there are 8-2 = 6 degrees of freedom. Using the regression equation n = and = 2.1 the point estimate is n = = L From Table 5, the critical value is t c = and from Eample 2, s e L Using these values, the margin of error is Insight The greater the difference between and, the wider the prediction interval is. For instance, in Eample 3 the 95% prediction intervals for are shown below. Compan sales (in thousands of dollars) % prediction intervals Advertising epenses (in thousands of dollars) E = t c s e A n + n n1g 2 2-1g 2 2 = A L Using n = and E = , the prediction interval is Left Endpoint = Right Endpoint = Interpretation You can be 95% confident that when advertising epenses are $2100, the compan sales will be between $183,735 and $237,449. Tr It Yourself Construct a 95% prediction interval for the compan sales when the advertising epenses are $2500. What can ou conclude? a. Specif n, d.f., t c, s e. b. Calculate n when = 2.5. c. Calculate the margin of error E. d. Construct the prediction interval. e. Interpret the results. Answer: Page A46

40 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS EXERCISES For Etra Help Building Basic Skills and Vocabular Graphical Analsis In Eercises 1 3, use the graph to answer the question. ( i, i ) ( i, i ) ( i, ) = 1. Describe the total variation about a regression line in words and in smbols. 2. Describe the eplained variation about a regression line in words and in smbols. 3. Describe the uneplained variation about a regression line in words and in smbols. 4. The coefficient of determination is the ratio of which two tpes of variations? What does the coefficient of determination measure? In Eercises 5 8, use the value of the linear correlation coefficient to calculate the coefficient of determination. What does this tell ou about the eplained variation of the data about the regression line? About the uneplained variation? 5. r = r = r = r = Using and Interpreting Concepts Finding Tpes of Variations and the Coefficient of Determination In Eercises 9 16, use the data to find the (a) coefficient of determination and interpret the result, and (b) standard error of estimate and interpret the result. 9. Stock Offerings The number of initial public offerings of stock issued in a recent 12-ear period and the total proceeds of these offerings (in millions of U.S. dollars) are shown in the table. The equation of the regression line is n = , (Source: Securities Data Co.) Number of issues, Proceeds, 17,784 28,745 42,572 32,478 34,585 65,069 s e Number of issues, Proceeds, 65,627 34,368 22,136 10,122 32,380 28,677

41 532 C H A P T E R 9 CORRELATION AND REGRESSION 10. Cigarettes The table shows the number of cigarettes consumed (in billions) in the United States and the number of cigarettes eported (in billions) from the United States for seven ears. The equation of the regression line is n = (Adapted from Alcohol and Tobacco Ta and Trade Bureau) Cigarettes consumed in U.S., Cigarettes eported b U.S., Retail Space and Sales The table shows the total square footage (in billions) of retailing space at shopping centers and their sales (in billions of U.S. dollars) for 11 ears. The equation of the regression line is n = (Adapted from International Council of Shopping Centers) Total square footage, Sales, Total square footage, Sales, Work and Leisure Time The median number of work hours per week and the median number of leisure hours per week for people in the United States for 10 recent ears are shown in the table. The equation of the regression line is n = (Source: Louis Harris & Associates) Median no. of work hrs. per week, Median no. of leisure hrs. per week, Median no. of work hrs. per week, Median no. of leisure hrs. per week, Earnings of Men and Women The table shows median weekl earnings (in U.S. dollars) of full-time male and female workers for five ears. The equation of the regression line is n = (Source: U.S. Bureau of Labor Statistics) Median weekl earnings of male workers, Median weekl earnings of female workers,

42 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS Voter Turnout The U.S. voting age population (in millions) and the turnout of the voting age population (in millions) for federal elections for eight nonpresidential election ears are shown in the table. The data can be modeled b the regression equation n = (Adapted from Federal Election Commission) Voting age population, Turnout in federal elections, Voting age population, Turnout in federal elections, Campaign Mone The mone raised and spent (both in millions of U.S. dollars) b all congressional campaigns for 8 recent 2-ear periods are shown in the table. The data can be modeled b the regression equation n = (Source: Federal Election Commission) Mone raised, Mone spent, Fund Assets The table shows the total assets (in billions of U.S. dollars) of individual retirement accounts (IRAs) and federal pension plans for nine ears. The equation of the regression line is n = (Source: Investment Compan Institute) IRAs, Federal pension plans, IRAs, Federal pension plans, Constructing and Interpreting Prediction Intervals In Eercises 17 24, construct the indicated prediction interval and interpret the results. 17. Proceeds Construct a 95% prediction interval for the proceeds from initial public offerings in Eercise 9 when the number of issues is Cigarettes Consumed Construct a 95% prediction interval for the cigarettes eported b the United States in Eercise 10 when the number of cigarettes consumed in the United States is 450 billion. 19. Retail Sales Using the results of Eercise 11, construct a 90% prediction interval for shopping center sales when the total square footage of shopping centers is 4.5 billion. 20. Leisure Hours Using the results of Eercise 12, construct a 90% prediction interval for the median number of leisure hours per week when the median number of work hours per week is 45.1.

43 534 C H A P T E R 9 CORRELATION AND REGRESSION 21. Earnings of Female Workers When the median weekl earnings of male workers is $650, find a 99% prediction interval for the median weekl earnings of female workers. Use the results of Eercise Predicting Voter Turnout When the voting age population is 210 million, construct a 99% prediction interval for the voter turnout in federal elections. Use the results of Eercise Campaign Spending A total of $775.8 million is raised in one 2-ear period for congressional campaigns. Construct a prediction interval for the mone spent b the campaigns. Use the results of Eercise 15 and c = Total Assets The total assets in IRAs is $2500 billion. Construct a prediction interval for the total assets in federal pension plans. Use the results of Eercise 16 and c = Etending Concepts Old Vehicles In Eercises 25 31, use the information shown at the right. 25. Scatter Plot Construct a scatter plot of the data. Show and on the graph. 26. Regression Line Find and graph the regression line. 27. Deviation Calculate the eplained deviation, the uneplained deviation, and the total deviation for each data point. Keeping cars longer. The median age of vehicles on U.S. roads for seven different ears: Median age in ears Cars, Trucks, (Source: Polk Co.) 28. Variation Find the (a) eplained variation, (b) uneplained variation, and (c) total variation. 29. Coefficient of Determination Find the coefficient of determination. What can ou conclude? 30. Error of Estimate Find the standard error of estimate and interpret the results. 31. Prediction Interval Construct a 95% prediction interval for the median age of trucks in use when the median age of cars in use is 8.6 ears. 32. Correlation Coefficient and Slope Recall that the formula for the correlation coefficient r is s e r = ng - 1g 21g 2 2ng 2-1g 2 2 2ng 2-1g 2 2 and the formula for the slope m of a regression line is m = ng - 1g 21g 2 ng 2-1g 2 2. Given a set of data, wh must the slope m of the data s regression line alwas have the same sign as the data s correlation coefficient r?

44 SECTION 9.3 MEASURES OF REGRESSION AND PREDICTION INTERVALS 535 Hpothesis Testing for Slope In Eercises 33 and 34, use the following information. When testing the slope M of the regression line for the population, ou usuall test that the slope is zero, or H 0 : M = 0. A slope of zero indicates that there is no linear relationship between and. To perform the t-test for the slope M, use the standardized test statistic t = m s e A g2-1g22 n with n - 2 degrees of freedom. Then, using the critical values found in Table 5 in Appendi B, make a decision whether to reject or fail to reject the null hpothesis. You can also use the LinRegTTest feature on a TI-83/84 to calculate the standardized test statistic as well as the corresponding P-value. If P a, then reject the null hpothesis. If P 7 a, then do not reject H The following table shows the weights (in pounds) and the number of hours slept in a da b a random sample of infants. Test the claim that M Z 0. Use a = Then interpret the results in the contet of the problem. If convenient, use technolog to solve the problem. Weight, Hours slept, The following table shows the ages (in ears) and salaries (in thousands of dollars) for a random sample of engineers at a compan. Test the claim that M Z 0. Use a = Then interpret the results in the contet of the problem. If convenient, use technolog to solve the problem. Age, Salar, Confidence Intervals for -Intercept and Slope You can construct confidence intervals for the -intercept B and slope M of the regression line = M + B for the population b using the following inequalities. -intercept B: slope M: b - E 6 B 6 b + E where E = t c s 1 e n + 2 Q g 2-1g22 n m - E 6 M 6 m + E and where E = t c s e A g2-1g22 n The values of m and b are obtained from the sample data, and the critical value t c is found using Table 5 in Appendi B with n - 2 degrees of freedom. In Eercises 35 and 36, construct the indicated confidence interval. 35. Construct the 95% confidence interval for B and M using the advertising epenses and compan sales data found in Eample Construct the 99% confidence interval for B and M using the advertising epenses and compan sales data found in Eample 2.

45 536 C H A P T E R 9 CORRELATION AND REGRESSION 9.4 Multiple Regression What You SHOULD LEARN How to use technolog to find a multiple regression equation, the standard error of estimate, and the coefficient of determination How to use a multiple regression equation to predict -values Finding a Multiple Regression Equation Predicting -Values Finding a Multiple Regression Equation In man instances, a better prediction model can be found for a dependent (response) variable b using more than one independent (eplanator) variable. For eample, a more accurate prediction for the compan sales discussed in previous sections might be made b considering the number of emploees on the sales staff as well as the advertising epenses. Models that contain more than one independent variable are multiple regression models. DEFINITION A multiple regression equation has the form n = b + m m m Á + m k k where 1, 2, 3, Á, k are the independent variables, b is the -intercept, and is the dependent variable. Insight Because the mathematics associated with multiple regression is complicated, this section focuses on how to use technolog to find a multiple regression equation and how to interpret the results. The -intercept b is the value of when all i are 0. Each coefficient m i is the amount of change in when the independent variable i is changed b one unit and all other independent variables are held constant. EXAMPLE 1 Finding a Multiple Regression Equation A researcher wants to determine how emploee salaries at a certain compan are related to the length of emploment, previous eperience, and education. The researcher selects eight emploees from the compan and obtains the following data. Emploee Salar, Emploment (in ears), 1 Eperience (in ears), 2 Education (in ears), 3 A 57, B 57, C 54, D 56, E 58, F 60, G 59, H 60, Use MINITAB to find a multiple regression equation that models the data.

46 SECTION 9.4 MULTIPLE REGRESSION 537 Solution Enter the -values in C1 and the 1 -, 2 -, and 3 -values in C2, C3, and C4, respectivel. Select RegressionNRegression Á from the Stat menu. Using the salaries as the response variable and the remaining data as the predictors, ou should obtain results similar to the following. Stud Tip In Eample 1, it is important that ou interpret the coefficients m 1, m 2, and m 3 correctl. For instance, if 2 and 3 are held constant and 1 increases b 1, then increases b $364. Similarl, if 1 and 3 are held constant and 2 increases b 1, then increases b $228. If 1 and 2 are held constant and 3 increases b 1, then increases b $267. MINITAB Regression Analsis: Salar, versus 1, 2, 3 The regression equation is Salar, = Predictor Coef SE Coef T P b Constant m m m 3 S = R-Sq = 94.4% R-Sq(adj) = 90.2% The regression equation is n = 49, Tr It Yourself 1 A statistics professor wants to determine how students final grades are related to the midterm eam grades and number of classes missed. The professor selects 10 students from her class and obtains the following data. Student Final grade, Midterm eam, 1 Classes missed, Use technolog to find a multiple regression equation that models the data. a. Enter the data. b. Calculate the regression line. Answer: Page A47 MINITAB displas much more than the regression equation and the coefficients of the independent variables. For eample, it also displas the standard error of estimate, denoted b S, and the coefficient of determination, denoted b R-Sq. In Eample 1, S = and R-Sq = 94.4%. So, the standard error of estimate is $ The coefficient of determination tells ou that 94.4% of the variation in can be eplained b the multiple regression model. The remaining 5.6% is uneplained and is due to other factors or chance.

47 538 C H A P T E R 9 CORRELATION AND REGRESSION Predicting -Values After finding the equation of the multiple regression line, ou can use the equation to predict -values over the range of the data. To predict -values, substitute the given value for each independent variable into the equation, then calculate n. In a lake in Finland, 159 fish of 7 species were caught and measured for weight G, length L, height H, and width W (H and W are percents of L). The regression equation for G and L is G = L, r = When all four variables are used, the regression equation is G = L H W, r = H EXAMPLE 2 Predicting -Values Using Multiple Regression Equations Use the regression equation found in Eample 1 to predict an emploee s salar given the following conditions ears of current emploment, 5 ears of eperience, and 16 ears of education 2. 4 ears of current emploment, 2 ears of eperience, and 12 ears of education 3. 8 ears of current emploment, 7 ears of eperience, and 17 ears of education Solution To predict each emploee s salar, substitute the values for 1, 2, and 3 into the regression equation. Then calculate n. 1. n = 49, = 49, = 59,544 The emploee s predicted salar is $59,544. L W Predict the weight of a fish with the following measurements: L 40, H 17, and W 11. How do our predictions var when ou use a single variable versus man variables? Which do ou think is more accurate? n = 49, = 49, = 54,880 The emploee s predicted salar is $54,880. n = 49, = 49, = 58,811 The emploee s predicted salar is $58,811. Tr It Yourself 2 Use the regression equation found in Tr It Yourself 1 to predict a student s final grade given the following conditions. 1. A student has a midterm eam score of 89 and misses 1 class. 2. A student has a midterm eam score of 78 and misses 3 classes. 3. A student has a midterm eam score of 83 and misses 2 classes. a. Substitute the midterm score for 1 into the regression equation. b. Substitute the corresponding number of missed classes for 2 into the regression equation. c. Calculate n. d. What is each student s final grade? Answer: Page A47

48 SECTION 9.4 MULTIPLE REGRESSION EXERCISES For Etra Help Building Basic Skills and Vocabular Predicting -Values In Eercises 1 4, use the multiple regression equation to predict the -values for the given values of the independent variables. 1. Cotton Yield The equation used to predict cotton ield (in pounds) is n = where 1 is the number of acres planted (in thousands) and 2 is the number of acres harvested (in thousands). (Source: U.S. National Agricultural Statistics Service) (a) 1 = 13,500, (c) 1 = 14,000, 2. Wheat Yield To predict the annual wheat ield (in bushels per acre), use the equation n = where 1 is the number of acres planted (in thousands) and 2 is the number of acres harvested (in thousands). (Source: U.S. National Agricultural Statistics Service) (a) 1 = 60,000, (c) 1 = 57,500, 3. Black Cherr Tree Volume The volume (in cubic feet) of black cherr trees can be modeled b the equation n = where 1 is the tree s height (in feet) and 2 is the tree s diameter (in inches). (Source: Journal of the Roal Statistical Societ) (a) 1 = 70, (c) 1 = 83, 4. Earnings per Share The earnings per share (in dollars) for McDonald s Corporation are given b the equation n = where 1 represents total revenue (in billions of dollars) and 2 represents shareholders equit (in billions of dollars). (Source: McDonald s Corporation) (a) 1 = 15.2, (c) 1 = 12.1, 2 = 12,000 2 = 13,500 2 = 45,000 2 = 50,000 2 = = = = 9.1 (b) 1 = 15,000, (d) 1 = 14,500, (b) 1 = 62,500, (d) 1 = 59,000, (b) 1 = 65, (d) 1 = 87, (b) 1 = 19.8, (d) 1 = 21.5, 2 = 13,500 2 = 13,000 2 = 53,000 2 = 48,500 2 = = = = 15.8 Using and Interpreting Concepts Finding a Multiple Regression Equation In Eercises 5 and 6, use technolog to find the multiple regression equation for the data shown in the table. Then answer the following. (a) What is the standard error of estimate? (b) What is the coefficient of determination? (c) Interpret the results of (a) and (b).

49 540 C H A P T E R 9 CORRELATION AND REGRESSION 5. Sales The total square footage (in billions) of retailing space at shopping centers, the number (in thousands) of shopping centers, and the sales (in billions of dollars) for shopping centers for a recent 11-ear period are shown in the table. (Adapted from International Council of Shopping Centers) Sales, Total square footage, 1 Number of shopping centers, Shareholder s Equit The following table shows the net sales (in billions of dollars), total assets (in billions of dollars), and shareholder s equit (in billions of dollars) for Wal-Mart for a recent si-ear period. (Adapted from Wal-Mart Stores, Inc.) Shareholder s equit, Net sales, 1 Total assets, Etending Concepts r 2 Adjusted The calculation of r 2, the coefficient of determination, depends on the number of data pairs and the number of independent variables. An adjusted value of r 2 can be calculated, based on the number of degrees of freedom, as follows. radj 2 = 1 - c 11 - r2 21n - 12 d n - k - 1 where n is the number of data pairs and k is the number of independent variables. In Eercises 7 and 8, after calculating determine the percentage of the variation in that can be eplained b the relationships between variables according to radj. 2 Compare this result with the one obtained using r 2. r 2 adj r 2 adj, 7. Calculate for the data in Eercise Calculate radj 2 for the data in Eercise 6.

50 Uses & Abuses Statistics in the Real World Uses Correlation and Regression Correlation and regression analsis can be used to determine whether there is a significant relationship between two variables. If there is, ou can use one of the variables to predict the value of the other variable. For eample, educators have used correlation and regression analsis to determine that there is a significant correlation between a student s SAT score and the grade point average from a student s freshman ear at college. Consequentl, man colleges and universities use SAT scores of high school applicants as a predictor of the applicant s initial success at college. Abuses Confusing Correlation and Causation The most common abuse of correlation in studies is to confuse the concepts of correlation with those of causation (see page 506). Good SAT scores do not cause good college grades. Rather, there are other variables, such as good stud habits and motivation, that contribute to both. When a strong correlation is found between two variables, look for other variables that are correlated with both. Considering Onl Linear Correlation The correlation studied in this chapter is linear correlation. When the correlation coefficient is close to 1 or close to -1, the data points can be modeled b a straight line. It is possible that a correlation coefficient is close to zero but there is still a strong correlation of a different tpe. Consider the data listed in the table at the left. The value of the correlation coefficient is 0; however, the data are perfectl correlated with the equation = 1 as shown in the graph. Ethics When data are collected, all of the data should be used when calculating statistics. In this chapter, ou learned that before finding the equation of a regression line, it is helpful to construct a scatter plot of the data to check for outliers, gaps, and clusters in the data. A researcher cannot use onl those data points that fit his or her hpothesis or those that show a significant correlation. Although eliminating outliers ma help a data set coincide with predicted patterns or fit a regression line, it is unethical to amend data in such a wa. An outlier or an other point that influences a regression model can be removed onl if it is properl justified. In most cases, the best and sometimes safest approach for presenting statistical measurements is with and without an outlier being included. B doing this, the decision as to whether or not to recognize the outlier is left to the reader. EXERCISES 1. Confusing Correlation and Causation Find an eample of an article in a newspaper or magazine that confuses correlation and causation. Discuss other variables that could contribute to the relationship between the variables. 2. Considering Onl Linear Correlation Find an eample of two real-life variables that have a nonlinear correlation. 541

51 542 C H A P T E R 9 CORRELATION AND REGRESSION 9 CHAPTER SUMMARY What did ou learn? Section 9.1 EXAMPLE(S) REVIEW EXERCISES How to construct a scatter plot How to find a correlation coefficient ng - 1g21g2 r = 2ng 2-1g2 2 2ng 2-1g2 2 4, How to perform a hpothesis test for a population correlation coefficient r r t = 1 - r 2 B n - 2 Section 9.2 How to find the equation of a regression line, n = m + b 1, m = b = - m = g n ng - 1g21g2 ng 2-1g2 2 - m g n How to predict -values using a regression equation Section 9.3 How to find and interpret the coefficient of determination r How to find and interpret the standard error of estimate for a regression line 2 23, 24 s e = A g1 i - n i 2 2 n - 2 = A g 2 - bg - mg n - 2 How to construct and interpret a prediction interval for, n - E 6 6 n + E E = t c s e A n + n ng 2-1g2 2 Section 9.4 How to use technolog to find a multiple regression equation, the standard error of estimate, and the coefficient of determination 1 31, 32 How to use a multiple regression equation to predict -values 2 33, 34 n = b + m m m Á + m k k

52 REVIEW EXERCISES REVIEW EXERCISES Section 9.1 In Eercises 1 4, displa the data in a scatter plot. Then find the sample correlation coefficient r. Determine whether there is a positive linear correlation, a negative linear correlation, or no linear correlation between the variables. What can ou conclude? 1. The ages of eight cows (in ears) and their milk production (in gallons) per week Age, Milk production, The number of convertibles sold at a car dealership in a month and the average monthl temperature (in degrees Fahrenheit) Convertibles sold, Average monthl temperature, The IQ and brain size, as measured b the total piel count (in thousands) from an MRI scan, for nine female college students (Adapted from Intelligence) IQ, Piel count, The annual per capita sugar consumption (in kilograms) and the average number of cavities of 11- and 12-ear-old children in seven countries Sugar consumption, Cavities, In Eercises 5 and 6, use the given sample statistics to test the claim about the population correlation coefficient r at the indicated level of significance a. 5. Claim: r Z 0, a = Sample statistics: r = 0.24, n = Claim: r Z 0, a = Sample statistics: r = -0.55, n = 22 In Eercises 7 10, test the claim about the population correlation coefficient r at the indicated level of significance a. Then interpret the decision in the contet of the original claim. 7. Refer to the data in Eercise 1. At a = 0.05, test the claim that there is a linear correlation between a cow s age and milk production. 8. Refer to the data in Eercise 2. Is there enough evidence to conclude that there is a linear correlation between the number of convertibles sold and the average monthl temperature? Use a = 0.05.

53 544 C H A P T E R 9 CORRELATION AND REGRESSION 9. Refer to the data in Eercise 3. At a = 0.01, test the claim that there is a linear correlation between a female college student s IQ and brain size. 10. Refer to the data in Eercise 4. Is there enough evidence to conclude that there is a linear correlation between sugar consumption and tooth deca? Use a = Section 9.2 In Eercises 11 14, find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. Can ou make a guess about the sign and magnitude of r? Calculate r and check our guess. If convenient, use technolog to solve the problem. 11. The heights (in inches) of adult brothers and sisters from 11 families Sister, Brother, The number of households (in millions) with multiple television sets and the average time (in minutes) per da spent watching television per household for nine ears (Adapted from Nielsen Media Research) Households, Time, The ages (in ears) and the number of hours of sleep in one night of seven adults Age, Hours of sleep, The engine displacement (in cubic inches) and the fuel efficienc (in miles per gallon) of seven automobiles Displacement, Fuel efficienc, In Eercises 15 18, use the regression equations found in Eercises to predict the value of for each value of, if meaningful. If not, eplain wh not. (Correlation between and in Eercises is significant for a = ) 15. Refer to Eercise 11. What height would ou predict for a male whose sister is (a) 61 inches? (b) 66 inches? (c) 75 inches? (d) 50 inches? 16. Refer to Eercise 12. What average time per da spent watching television per household would ou predict for (a) 60 million households with multiple television sets? (b) 85 million households with multiple television sets? (c) 75 million households with multiple television sets? (d) 97 million households with multiple television sets? 17. Refer to Eercise 13. How man hours of sleep would ou predict for an adult of age (a) 18 ears? (b) 25 ears? (c) 85 ears? (d) 50 ears?

54 REVIEW EXERCISES Refer to Eercise 14. What fuel efficienc rating would ou predict for a car with an engine displacement of (a) 86 cubic inches? (b) 198 cubic inches? (c) 289 cubic inches? (d) 407 cubic inches? Section 9.3 In Eercises 19 22, use the value of the linear correlation coefficient r to find the coefficient of determination. Interpret the result. 19. r = r = r = r = In Eercises 23 and 24, use the data to find the (a) coefficient of determination r 2 and interpret the result, and (b) standard error of estimate s e and interpret the result. 23. The table shows the area of eight living spaces (in square feet) and the cooling capacit (in Btu per hour) of the air conditioners used in those spaces. The regression equation is n = (Adapted from Consumer Reports) Living area, Cooling capacit, 10, The table shows the cooking areas (in square inches) of 18 gas grills and their prices (in dollars).the regression equation is n = (Source: Lowe s) Area, Price, Area, Price, In Eercises 25 30, construct the indicated prediction interval and interpret the results. 25. Construct a 90% prediction interval for the height of a male in Eercise 11 whose sister is 64 inches tall. 26. Construct a 90% prediction interval for the average time per da spent watching television per household in Eercise 12 when 74 million households have multiple television sets Construct a 95% prediction interval for the number of hours of sleep for an adult in Eercise 13 who is 45 ears old. 28. Construct a 95% prediction interval for the fuel efficienc of an automobile in Eercise 14 that has an engine displacement of 265 cubic inches.

55 546 C H A P T E R 9 CORRELATION AND REGRESSION 29. Construct a 99% prediction interval for the cooling capacit of an air conditioner in Eercise 23 that is used in a living area of 720 square feet. 30. Construct a 99% prediction interval for the price of a gas grill in Eercise 24 with a usable cooking area of 900 square inches. Section 9.4 In Eercises 31 and 32, use the data in the table, which shows the carbon monoide, tar, and nicotine content, all in milligrams, of 14 brands of U.S. cigarettes. (Source: Federal Trade Commission) Carbon monoide, Tar, 1 Nicotine, Use technolog to find the multiple regression equation from the data. 32. Find the standard error of estimate and the coefficient of determination r 2. What percentage of the variation of can be eplained b the regression equation? In Eercises 33 and 34, use the multiple regression equation to predict the -values for the given values of the independent variables. 33. An equation that can be used to predict fuel econom (in miles per gallon) for automobiles is n = , where 1 is the engine displacement (in cubic inches) and 2 is the vehicle weight (in pounds). (a) 1 = 305, 2 = 3750 (b) 1 = 225, 2 = 3100 (c) 1 = 105, 2 = 2200 (d) 1 = 185, 2 = Use the regression equation found in Eercise 31. (a) 1 = 10, 2 = 0.7 (b) 1 = 15, 2 = 1.1 (c) 1 = 13, 2 = 0.8 (d) 1 = 9, 2 = 0.8 s e

56 CHAPTER QUIZ CHAPTER QUIZ Take this quiz as ou would take a quiz in class. After ou are done, check our work against the answers given in the back of the book. For Eercises 1 8, use the data in the table, which shows the personal income and outlas (both in trillions of dollars) for Americans for 11 recent ears. (Adapted from U.S. Commerce Department, Bureau of Economic Analsis) Personal income, Personal outlas, Construct a scatter plot for the data. Do the data appear to have a positive linear correlation, a negative linear correlation, or no linear correlation? Eplain. 2. Calculate the correlation coefficient r. What can ou conclude? 3. Test the level of significance of the correlation coefficient r. Use a = Find the equation of the regression line for the data. Draw the regression line on the scatter plot. 5. Use the regression equation to predict the personal outlas when the personal income is 6.2 trillion dollars. 6. Find the coefficient of determination r and interpret the results. 7. Find the standard error of estimate s e and interpret the results. 8. Construct a 95% prediction interval for personal outlas when personal income is 7.6 trillion dollars. Interpret the results. 9. The equation used to predict sunflower ield (in pounds) is n = where 1 is the number of acres planted (in thousands) and 2 is the number of acres harvested (in thousands). Use the regression equation to predict the -values for the given values of the independent variables listed below. Then determine which variable has a greater influence on the value of. (Source: U.S. National Agricultural Statistics Service) (a) 1 = 2643, 2 = 2518 (b) 1 = 3506, 2 = 3383 (c) 1 = 1894, 2 = 1726 (d) 1 = 2344, 2 =

57 Putting It All Together REAL Statistics Real Decisions The Center for Infant Development, part of the Department of Pscholog at the Universit of Denver, is a research center that focuses on perceptual and cognitive development in infants from two months to three ears of age. Graduate students, postdoctoral fellows, and members of the research staff can perform studies to see how infants think and behave under particular circumstances. You are a member of the research staff at the Center for Infant Development at the Universit of Denver. You are interested in determining whether babies take longer to learn how to crawl during the cold winter months than during the warmer summer months. You decide to record the age (in weeks) of each child when the child is able to crawl and the temperature (in degrees Fahrenheit) when the child crawls. You want to determine if there is a significant correlation between the mean crawling age and the mean temperature. Eercises 1. Analzing the Data (a) The data in the table at the right show the mean temperature (in degrees Fahrenheit) for each month and the mean age (in weeks) of the infants that began to crawl during that month. Construct a scatter plot of the data. Use the graph to determine whether there is a positive linear correlation, a negative linear correlation, or no linear correlation between the mean crawling age and the mean monthl temperature. (b) Calculate the correlation coefficient r and verif our conclusion in part (a). (c) Test the significance of the correlation coefficient found in part (b). Use a = (d) Find the equation of the regression line for the temperature and the mean crawling age. Add the graph of the regression line to our scatter plot in part (a). Does the regression line appear to be a good fit? (e) Can ou use the equation of the regression line to predict the mean crawling age given the temperature? Wh or wh not? (f) Find the coefficient of determination r 2 and the standard error of estimate s e and interpret the results. 2. What Do You Think? According to our analsis, would ou agree or disagree with the following? Wh or wh not? Because there is a significant linear correlation between mean monthl temperature and the mean crawling age, parents can consider the temperature when determining when their bab will crawl. Mean monthl temperature, Mean crawling age, (Source: Janette Benson Infant Behavior and Development)

CORRELATION AND REGRESSION

CORRELATION AND REGRESSION 9 CHAPTER CORRELATION AND REGRESSION 9.1 Correlation ACTIVITY 9.2 Linear Regression ACTIVITY CASE STUDY 9.3 Measures of Regression and Prediction Intervals 9.4 Multiple Regression USES AND ABUSES REAL

More information

Chapter 11. Correlation and Regression

Chapter 11. Correlation and Regression Chapter 11 Correlation and Regression Correlation A relationship between two variables. The data can be represented b ordered pairs (, ) is the independent (or eplanator) variable is the dependent (or

More information

Historical Note. Regression. Line of Best Fit

Historical Note. Regression. Line of Best Fit 11 4 Regression Objective 4. Compute the equation of the regression line. In studing relationships between two variables, collect the data and then construct a scatter plot. The purpose of the scatter

More information

Introduction Direct Variation Rates of Change Scatter Plots. Introduction. EXAMPLE 1 A Mathematical Model

Introduction Direct Variation Rates of Change Scatter Plots. Introduction. EXAMPLE 1 A Mathematical Model APPENDIX B Mathematical Modeling B1 Appendi B Mathematical Modeling B.1 Modeling Data with Linear Functions Introduction Direct Variation Rates of Change Scatter Plots Introduction The primar objective

More information

Essential Question How can you use a scatter plot and a line of fit to make conclusions about data?

Essential Question How can you use a scatter plot and a line of fit to make conclusions about data? . Scatter Plots and Lines of Fit Essential Question How can ou use a scatter plot and a line of fit to make conclusions about data? A scatter plot is a graph that shows the relationship between two data

More information

1.7 Inverse Functions

1.7 Inverse Functions 71_0107.qd 1/7/0 10: AM Page 17 Section 1.7 Inverse Functions 17 1.7 Inverse Functions Inverse Functions Recall from Section 1. that a function can be represented b a set of ordered pairs. For instance,

More information

How can you construct and interpret a scatter plot? ACTIVITY: Constructing a Scatter Plot

How can you construct and interpret a scatter plot? ACTIVITY: Constructing a Scatter Plot 9. Scatter Plots How can ou construct and interpret a scatter plot? ACTIVITY: Constructing a Scatter Plot Work with a partner. The weights (in ounces) and circumferences C (in inches) of several sports

More information

3.2 Introduction to Functions

3.2 Introduction to Functions 8 CHAPTER Graphs and Functions Write each statement as an equation in two variables. Then graph each equation. 97. The -value is more than three times the -value. 98. The -value is - decreased b twice

More information

Linear and Nonlinear Systems of Equations. The Method of Substitution. Equation 1 Equation 2. Check (2, 1) in Equation 1 and Equation 2: 2x y 5?

Linear and Nonlinear Systems of Equations. The Method of Substitution. Equation 1 Equation 2. Check (2, 1) in Equation 1 and Equation 2: 2x y 5? 3330_070.qd 96 /5/05 Chapter 7 7. 9:39 AM Page 96 Sstems of Equations and Inequalities Linear and Nonlinear Sstems of Equations What ou should learn Use the method of substitution to solve sstems of linear

More information

Algebra II Foundations

Algebra II Foundations Algebra II Foundations Non Linear Functions Student Journal Problems of the Da First Semester Page 35 Problem Set 35 CHALLENGE Tr the following problem, and eplain how ou determined our answer. If it takes

More information

6.4 graphs OF logarithmic FUnCTIOnS

6.4 graphs OF logarithmic FUnCTIOnS SECTION 6. graphs of logarithmic functions 9 9 learning ObjeCTIveS In this section, ou will: Identif the domain of a logarithmic function. Graph logarithmic functions. 6. graphs OF logarithmic FUnCTIOnS

More information

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Correlation and Regression McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Example 10-2: Absences/Final Grades Please enter the data below in L1 and L2. The data appears on page 537 of your textbook.

More information

Analyzing Lines of Fit

Analyzing Lines of Fit 4.5 Analyzing Lines of Fit Essential Question How can you analytically find a line of best fit for a scatter plot? Finding a Line of Best Fit Work with a partner. The scatter plot shows the median ages

More information

Maintaining Mathematical Proficiency

Maintaining Mathematical Proficiency Name Date Chapter 5 Maintaining Mathematical Proficienc Graph the equation. 1. + =. = 3 3. 5 + = 10. 3 = 5. 3 = 6. 3 + = 1 Solve the inequalit. Graph the solution. 7. a 3 > 8. c 9. d 5 < 3 10. 8 3r 5 r

More information

P.4 Lines in the Plane

P.4 Lines in the Plane 28 CHAPTER P Prerequisites P.4 Lines in the Plane What ou ll learn about Slope of a Line Point-Slope Form Equation of a Line Slope-Intercept Form Equation of a Line Graphing Linear Equations in Two Variables

More information

determine whether or not this relationship is.

determine whether or not this relationship is. Section 9-1 Correlation A correlation is a between two. The data can be represented by ordered pairs (x,y) where x is the (or ) variable and y is the (or ) variable. There are several types of correlations

More information

74 Maths Quest 10 for Victoria

74 Maths Quest 10 for Victoria Linear graphs Maria is working in the kitchen making some high energ biscuits using peanuts and chocolate chips. She wants to make less than g of biscuits but wants the biscuits to contain at least 8 g

More information

Chapter 6. Exploring Data: Relationships

Chapter 6. Exploring Data: Relationships Chapter 6 Exploring Data: Relationships For All Practical Purposes: Effective Teaching A characteristic of an effective instructor is fairness and consistenc in grading and evaluating student performance.

More information

Functions. Essential Question What are some of the characteristics of the graph of an exponential function? ) x e. f (x) = ( 1 3 ) x f.

Functions. Essential Question What are some of the characteristics of the graph of an exponential function? ) x e. f (x) = ( 1 3 ) x f. 7. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS A..A Eponential Growth and Deca Functions Essential Question What are some of the characteristics of the graph of an eponential function? You can use a graphing

More information

Data transformation. Core: Data analysis. Chapter 5

Data transformation. Core: Data analysis. Chapter 5 Chapter 5 5 Core: Data analsis Data transformation ISBN 978--7-56757-3 Jones et al. 6 66 Core Chapter 5 Data transformation 5A Introduction You first encountered data transformation in Chapter where ou

More information

Ready To Go On? Skills Intervention 2-1 Solving Linear Equations and Inequalities

Ready To Go On? Skills Intervention 2-1 Solving Linear Equations and Inequalities A Read To Go n? Skills Intervention -1 Solving Linear Equations and Inequalities Find these vocabular words in Lesson -1 and the Multilingual Glossar. Vocabular equation solution of an equation linear

More information

Graphing Linear Equations

Graphing Linear Equations 1-3 BJECTIVES Graph linear equations. Find the - and -intercepts of a line. Find the slope of a line through two points. Find zeros of linear functions. Graphing Linear Equations AGRICULTURE American farmers

More information

Exponential and Logarithmic Functions

Exponential and Logarithmic Functions Eponential and Logarithmic Functions 6 Figure Electron micrograph of E. Coli bacteria (credit: Mattosaurus, Wikimedia Commons) CHAPTER OUTLINE 6. Eponential Functions 6. Logarithmic Properties 6. Graphs

More information

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Correlation and Regression McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Overview Introduction 10-1 Scatter Plots and Correlation 10- Regression 10-3 Coefficient of Determination and

More information

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics Chapter 13 Student Lecture Notes 13-1 Department of Quantitative Methods & Information Sstems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analsis QMIS 0 Dr. Mohammad

More information

Solve each system by graphing. Check your solution. y =-3x x + y = 5 y =-7

Solve each system by graphing. Check your solution. y =-3x x + y = 5 y =-7 Practice Solving Sstems b Graphing Solve each sstem b graphing. Check our solution. 1. =- + 3 = - (1, ). = 1 - (, 1) =-3 + 5 3. = 3 + + = 1 (, 3). =-5 = - 7. = 3-5 3 - = 0 (1, 5) 5. -3 + = 5 =-7 (, 7).

More information

Linear Functions. Essential Question How can you determine whether a function is linear or nonlinear?

Linear Functions. Essential Question How can you determine whether a function is linear or nonlinear? . Linear Functions Essential Question How can ou determine whether a function is linear or nonlinear? Finding Patterns for Similar Figures Work with a partner. Cop and complete each table for the sequence

More information

4.4 Scatter Plots and Lines of Fit 4.5 Analyzing Lines of Fit 4.6 Arithmetic Sequences 4.7 Piecewise Functions

4.4 Scatter Plots and Lines of Fit 4.5 Analyzing Lines of Fit 4.6 Arithmetic Sequences 4.7 Piecewise Functions Writing Linear Functions. Writing Equations in Slope-Intercept Form. Writing Equations in Point-Slope Form.3 Writing Equations of Parallel and Perpendicular Lines. Scatter Plots and Lines of Fit.5 Analzing

More information

Fair Game Review. Chapter 2. and y = 5. Evaluate the expression when x = xy 2. 4x. Evaluate the expression when a = 9 and b = 4.

Fair Game Review. Chapter 2. and y = 5. Evaluate the expression when x = xy 2. 4x. Evaluate the expression when a = 9 and b = 4. Name Date Chapter Fair Game Review Evaluate the epression when = and =.... 0 +. 8( ) Evaluate the epression when a = 9 and b =.. ab. a ( b + ) 7. b b 7 8. 7b + ( ab ) 9. You go to the movies with five

More information

Practice A ( 1, 3 ( 0, 1. Match the function with its graph. 3 x. Explain how the graph of g can be obtained from the graph of f. 5 x.

Practice A ( 1, 3 ( 0, 1. Match the function with its graph. 3 x. Explain how the graph of g can be obtained from the graph of f. 5 x. 8. Practice A For use with pages 65 7 Match the function with its graph.. f. f.. f 5. f 6. f f Lesson 8. A. B. C. (, 6) (0, ) (, ) (0, ) ( 0, ) (, ) D. E. F. (0, ) (, 6) ( 0, ) (, ) (, ) (0, ) Eplain how

More information

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000 Lecture 14 Analysis of Variance * Correlation and Regression Outline Analysis of Variance (ANOVA) 11-1 Introduction 11-2 Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) Outline Lecture 14 Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) 11-1 Introduction 11- Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

Topic - 12 Linear Regression and Correlation

Topic - 12 Linear Regression and Correlation Topic 1 Linear Regression and Correlation Correlation & Regression Univariate & Bivariate tatistics U: frequenc distribution, mean, mode, range, standard deviation B: correlation two variables Correlation

More information

Functions. Introduction

Functions. Introduction Functions,00 P,000 00 0 70 7 80 8 0 000 00 00 Figure Standard and Poor s Inde with dividends reinvested (credit "bull": modification of work b Praitno Hadinata; credit "graph": modification of work b MeasuringWorth)

More information

Lesson Remember. Finding Domain and Range from a Graph EXAMPLE. Key Vocabulary

Lesson Remember. Finding Domain and Range from a Graph EXAMPLE. Key Vocabulary 0. Lesson Ke Vocabular function domain range function form Functions A function is a relationship that pairs each input with eactl one output. The domain is the set of all possible input values. The range

More information

Fair Game Review. Chapter = How many calculators are sold when the profit is $425? Solve the equation. Check your solution.

Fair Game Review. Chapter = How many calculators are sold when the profit is $425? Solve the equation. Check your solution. Name Date Chapter 4 Fair Game Review Solve the equation. Check our solution.. 8 3 = 3 2. 4a + a = 2 3. 9 = 4( 3k 4) 7k 4. ( m) 2 5 6 2 = 8 5. 5 t + 8t = 3 6. 3 5h 2 h + 4 = 0 2 7. The profit P (in dollars)

More information

ONLINE PAGE PROOFS. Relationships between two numerical variables

ONLINE PAGE PROOFS. Relationships between two numerical variables 14 Relationships between two numerical variables 14.1 Kick off with CAS 14.2 Scatterplots and basic correlation 14.3 Further correlation coefficients 14.4 Making predictions 14.5 Review 14.1 Kick off with

More information

3.2 Understanding Relations and Functions-NOTES

3.2 Understanding Relations and Functions-NOTES Name Class Date. Understanding Relations and Functions-NOTES Essential Question: How do ou represent relations and functions? Eplore A1.1.A decide whether relations represented verball, tabularl, graphicall,

More information

MATH 021 UNIT 1 HOMEWORK ASSIGNMENTS

MATH 021 UNIT 1 HOMEWORK ASSIGNMENTS MATH 01 UNIT 1 HOMEWORK ASSIGNMENTS General Instructions You will notice that most of the homework assignments for a section have more than one part. Usuall, the part (A) questions ask for eplanations,

More information

1Write and graph. 2Solve problems. Now. Then. Why? New Vocabulary

1Write and graph. 2Solve problems. Now. Then. Why? New Vocabulary Direct Variation Then You found rates of change of linear functions. (Lesson -) Now Write and graph direct variation equations. Solve problems involving direct variation. Wh? Bianca is saving her mone

More information

Functions. Introduction CHAPTER OUTLINE

Functions. Introduction CHAPTER OUTLINE Functions,00 P,000 00 0 970 97 980 98 990 99 000 00 00 Figure Standard and Poor s Inde with dividends reinvested (credit "bull": modification of work b Praitno Hadinata; credit "graph": modification of

More information

1.5. Analyzing Graphs of Functions. The Graph of a Function. What you should learn. Why you should learn it. 54 Chapter 1 Functions and Their Graphs

1.5. Analyzing Graphs of Functions. The Graph of a Function. What you should learn. Why you should learn it. 54 Chapter 1 Functions and Their Graphs 0_005.qd /7/05 8: AM Page 5 5 Chapter Functions and Their Graphs.5 Analzing Graphs of Functions What ou should learn Use the Vertical Line Test for functions. Find the zeros of functions. Determine intervals

More information

2-3. Linear Regression and Correlation. Vocabulary

2-3. Linear Regression and Correlation. Vocabulary Chapter 2 Lesson 2-3 Linear Regression and Correlation BIG IDEA The regression line is the line of best fi t to data. The correlation coeffi cient measures the strength and direction of a linear pattern

More information

3.7 InveRSe FUnCTIOnS

3.7 InveRSe FUnCTIOnS CHAPTER functions learning ObjeCTIveS In this section, ou will: Verif inverse functions. Determine the domain and range of an inverse function, and restrict the domain of a function to make it one-to-one.

More information

Algebra 2 Chapter 2 Page 1

Algebra 2 Chapter 2 Page 1 Mileage (MPGs) Section. Relations and Functions. To graph a relation, state the domain and range, and determine if the relation is a function.. To find the values of a function for the given element of

More information

LESSON #24 - POWER FUNCTIONS COMMON CORE ALGEBRA II

LESSON #24 - POWER FUNCTIONS COMMON CORE ALGEBRA II 1 LESSON #4 - POWER FUNCTIONS COMMON CORE ALGEBRA II Before we start to analze polnomials of degree higher than two (quadratics), we first will look at ver simple functions known as power functions. The

More information

3.2 LOGARITHMIC FUNCTIONS AND THEIR GRAPHS

3.2 LOGARITHMIC FUNCTIONS AND THEIR GRAPHS Section. Logarithmic Functions and Their Graphs 7. LOGARITHMIC FUNCTIONS AND THEIR GRAPHS Ariel Skelle/Corbis What ou should learn Recognize and evaluate logarithmic functions with base a. Graph logarithmic

More information

15.2 Graphing Logarithmic

15.2 Graphing Logarithmic _ - - - - - - Locker LESSON 5. Graphing Logarithmic Functions Teas Math Standards The student is epected to: A.5.A Determine the effects on the ke attributes on the graphs of f () = b and f () = log b

More information

LESSON #11 - FORMS OF A LINE COMMON CORE ALGEBRA II

LESSON #11 - FORMS OF A LINE COMMON CORE ALGEBRA II LESSON # - FORMS OF A LINE COMMON CORE ALGEBRA II Linear functions come in a variet of forms. The two shown below have been introduced in Common Core Algebra I and Common Core Geometr. TWO COMMON FORMS

More information

The American School of Marrakesh. Algebra 2 Algebra 2 Summer Preparation Packet

The American School of Marrakesh. Algebra 2 Algebra 2 Summer Preparation Packet The American School of Marrakesh Algebra Algebra Summer Preparation Packet Summer 016 Algebra Summer Preparation Packet This summer packet contains eciting math problems designed to ensure our readiness

More information

What You ll Learn Identify direct variation. Use direct variation to solve problems.

What You ll Learn Identify direct variation. Use direct variation to solve problems. AM_S_C_L_3.indd Page // 3: PM s-user /Volumes//GO/CORE_READING/TENNESSEE/ANCILLARY... Proportionalit and Linear Relationships Teach the Concept Lesson - Direct Variation Interactive Stud Guide See pages

More information

Chapter 4. Chapter 4 Opener. Section 4.1. Big Ideas Math Blue Worked-Out Solutions. x 2. Try It Yourself (p. 147) x 0 1. y ( ) x 2

Chapter 4. Chapter 4 Opener. Section 4.1. Big Ideas Math Blue Worked-Out Solutions. x 2. Try It Yourself (p. 147) x 0 1. y ( ) x 2 Chapter Chapter Opener Tr It Yourself (p. 7). As the input decreases b, the output increases b.. Input As the input increases b, the output increases b.. As the input decreases b, the output decreases

More information

H.Algebra 2 Summer Review Packet

H.Algebra 2 Summer Review Packet H.Algebra Summer Review Packet 1 Correlation of Algebra Summer Packet with Algebra 1 Objectives A. Simplifing Polnomial Epressions Objectives: The student will be able to: Use the commutative, associative,

More information

c. Find the slope and y-intercept of the graph of the linear equation. Then sketch its graph.

c. Find the slope and y-intercept of the graph of the linear equation. Then sketch its graph. Name Solve. End-of-Course. 7 =. 5 c =. One cell phone plan charges $0 per month plus $0.5 per minute used. A second cell phone plan charges $5 per month plus $0.0 per minute used. Write and solve an equation

More information

x. 4. 2x 10 4x. 10 x

x. 4. 2x 10 4x. 10 x CCGPS UNIT Semester 1 COORDINATE ALGEBRA Page 1 of Reasoning with Equations and Quantities Name: Date: Understand solving equations as a process of reasoning and eplain the reasoning MCC9-1.A.REI.1 Eplain

More information

Unit 26 Solving Inequalities Inequalities on a Number Line Solution of Linear Inequalities (Inequations)

Unit 26 Solving Inequalities Inequalities on a Number Line Solution of Linear Inequalities (Inequations) UNIT Solving Inequalities: Student Tet Contents STRAND G: Algebra Unit Solving Inequalities Student Tet Contents Section. Inequalities on a Number Line. of Linear Inequalities (Inequations). Inequalities

More information

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1 Algebra I Name Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation Day #1 Period Date When a table of values increases or decreases by the same

More information

Functions. Introduction

Functions. Introduction Functions,00 P,000 00 0 970 97 980 98 990 99 000 00 00 Figure Standard and Poor s Inde with dividends reinvested (credit "bull": modification of work b Praitno Hadinata; credit "graph": modification of

More information

DMA 50 Worksheet #1 Introduction to Graphs: Analyzing, Interpreting, and Creating Graphs

DMA 50 Worksheet #1 Introduction to Graphs: Analyzing, Interpreting, and Creating Graphs DMA 0 Worksheet #1 Introduction to Graphs: Analzing, Interpreting, and Creating Graphs A graph will be given followed b a set of questions to answer. Show our work. The bar graph below shows the number

More information

Chapter 2 Linear Relations and Functions

Chapter 2 Linear Relations and Functions Chapter Linear Relations and Functions I. Relations and Functions A. Definitions 1. Relation. Domain the variable ( ) 3. Range the variable ( ). Function a) A relationship between ( ) and ( ). b) The output

More information

Exponential and Logarithmic Functions

Exponential and Logarithmic Functions Eponential and Logarithmic Functions.1 Eponential Growth and Deca Functions. The Natural Base e.3 Logarithms and Logarithmic Functions. Transformations of Eponential and Logarithmic Functions.5 Properties

More information

LESSON #12 - FORMS OF A LINE COMMON CORE ALGEBRA II

LESSON #12 - FORMS OF A LINE COMMON CORE ALGEBRA II LESSON # - FORMS OF A LINE COMMON CORE ALGEBRA II Linear functions come in a variet of forms. The two shown below have been introduced in Common Core Algebra I and Common Core Geometr. TWO COMMON FORMS

More information

Maintaining Mathematical Proficiency

Maintaining Mathematical Proficiency Chapter Maintaining Mathematical Proficienc Find the -intercept of the graph of the linear equation. 1. = + 3. = 3 + 5 3. = 10 75. = ( 9) 5. 7( 10) = +. 5 + 15 = 0 Find the distance between the two points.

More information

Domain, Range, and End Behavior

Domain, Range, and End Behavior Locker LESSON 1.1 Domain, Range, and End Behavior Common Core Math Standards The student is epected to: F-IF.5 Relate the domain of a function to its graph and, where applicable, to the quantitative relationship

More information

Unit 4 Relations and Functions. 4.1 An Overview of Relations and Functions. January 26, Smart Board Notes Unit 4.notebook

Unit 4 Relations and Functions. 4.1 An Overview of Relations and Functions. January 26, Smart Board Notes Unit 4.notebook Unit 4 Relations and Functions 4.1 An Overview of Relations and Functions Jan 26 5:56 PM Jan 26 6:25 PM A Relation associates the elements of one set of objects with the elements of another set. Relations

More information

LESSON #28 - POWER FUNCTIONS COMMON CORE ALGEBRA II

LESSON #28 - POWER FUNCTIONS COMMON CORE ALGEBRA II 1 LESSON #8 - POWER FUNCTIONS COMMON CORE ALGEBRA II Before we start to analze polnomials of degree higher than two (quadratics), we first will look at ver simple functions known as power functions. The

More information

Chapter 5: Systems of Equations

Chapter 5: Systems of Equations Chapter : Sstems of Equations Section.: Sstems in Two Variables... 0 Section. Eercises... 9 Section.: Sstems in Three Variables... Section. Eercises... Section.: Linear Inequalities... Section.: Eercises.

More information

Modeling with Exponential and Logarithmic Functions 6.7. Essential Question How can you recognize polynomial, exponential, and logarithmic models?

Modeling with Exponential and Logarithmic Functions 6.7. Essential Question How can you recognize polynomial, exponential, and logarithmic models? .7 Modeling with Eponential and Logarithmic Functions Essential Question How can ou recognize polnomial, eponential, and logarithmic models? Recognizing Different Tpes of Models Work with a partner. Match

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

A calculator may be used on the exam.

A calculator may be used on the exam. The Algebra Semester A eamination has the following tpes of questions: Selected Response Student Produced Response (Grid-in) Brief Constructed Response (BCR) Etended Constructed Response (ECR) Short Answer

More information

Maintaining Mathematical Proficiency

Maintaining Mathematical Proficiency Name Date Chapter 3 Maintaining Mathematical Proficienc Plot the point in a coordinate plane. Describe the location of the point. 1. A( 3, 1). B (, ) 3. C ( 1, 0). D ( 5, ) 5. Plot the point that is on

More information

TOPIC ESSENTIAL QUESTION

TOPIC ESSENTIAL QUESTION TOPIC Topic Review TOPIC ESSENTIAL QUESTION 1. How can linear functions be used to model situations and solve problems? Vocabular Review Choose the correct term to complete each sentence.. When tends to

More information

Learning Objective: We will construct and interpret scatterplots (G8M6L4)

Learning Objective: We will construct and interpret scatterplots (G8M6L4) Learning Objective: We will construct and interpret scatterplots (G8ML) Concept Development: A Scatter Plot is a graph of numerical data on two variables. Eamples: -- The number of hours ou stud for a

More information

Exam 3 Practice Questions Psych , Fall 9

Exam 3 Practice Questions Psych , Fall 9 Vocabular Eam 3 Practice Questions Psch 3101-100, Fall 9 Rather than choosing some practice terms at random, I suggest ou go through all the terms in the vocabular lists. The real eam will ask for definitions

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

Essential Question How can you use a quadratic function to model a real-life situation?

Essential Question How can you use a quadratic function to model a real-life situation? 3. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS A..A A..E A..A A..B A..C Modeling with Quadratic Functions Essential Question How can ou use a quadratic function to model a real-life situation? Work with a partner.

More information

Review Topics for MATH 1400 Elements of Calculus Table of Contents

Review Topics for MATH 1400 Elements of Calculus Table of Contents Math 1400 - Mano Table of Contents - Review - page 1 of 2 Review Topics for MATH 1400 Elements of Calculus Table of Contents MATH 1400 Elements of Calculus is one of the Marquette Core Courses for Mathematical

More information

ACTIVITY: Comparing Types of Decay

ACTIVITY: Comparing Types of Decay 6.6 Eponential Deca eponential deca? What are the characteristics of 1 ACTIVITY: Comparing Tpes of Deca Work with a partner. Describe the pattern of deca for each sequence and graph. Which of the patterns

More information

11.1 Inverses of Simple Quadratic and Cubic Functions

11.1 Inverses of Simple Quadratic and Cubic Functions Locker LESSON 11.1 Inverses of Simple Quadratic and Cubic Functions Teas Math Standards The student is epected to: A..B Graph and write the inverse of a function using notation such as f (). Also A..A,

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and can be printed and given to the

More information

Statistics 100 Exam 2 March 8, 2017

Statistics 100 Exam 2 March 8, 2017 STAT 100 EXAM 2 Spring 2017 (This page is worth 1 point. Graded on writing your name and net id clearly and circling section.) PRINT NAME (Last name) (First name) net ID CIRCLE SECTION please! L1 (MWF

More information

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

More information

Lecture Guide. Math 42 - Elementary Algebra. Stephen Toner. Introductory Algebra, 3rd edition. Miller, O'Neill, Hyde. Victor Valley College

Lecture Guide. Math 42 - Elementary Algebra. Stephen Toner. Introductory Algebra, 3rd edition. Miller, O'Neill, Hyde. Victor Valley College Lecture Guide Math 42 - Elementar Algebra to accompan Introductor Algebra, 3rd edition Miller, O'Neill, Hde Prepared b Stephen Toner Victor Valle College Accompaning videos can be found at www.mathvideos.net.

More information

Writing Equations in Point-Slope Form

Writing Equations in Point-Slope Form . Writing Equations in Point-Slope Form Essential Question How can ou write an equation of a line when ou are given the slope and a point on the line? Writing Equations of Lines Work with a partner. Sketch

More information

CONSUMER CHOICES Madison is thinking about leasing a car for. Example 1 Solve the system of equations by graphing.

CONSUMER CHOICES Madison is thinking about leasing a car for. Example 1 Solve the system of equations by graphing. 2-1 BJECTIVES Solve sstems of equations graphicall. Solve sstems of equations algebraicall. Solving Sstems of Equations in Two Variables CNSUMER CHICES Madison is thinking about leasing a car for two ears.

More information

10.2 Graphing Exponential Functions

10.2 Graphing Exponential Functions Name Class Date 10. Graphing Eponential Functions Essential Question: How do ou graph an eponential function of the form f () = ab? Resource Locker Eplore Eploring Graphs of Eponential Functions Eponential

More information

Laurie s Notes. Overview of Section 3.5

Laurie s Notes. Overview of Section 3.5 Overview of Section.5 Introduction Sstems of linear equations were solved in Algebra using substitution, elimination, and graphing. These same techniques are applied to nonlinear sstems in this lesson.

More information

Analytic Geometry 300 UNIT 9 ANALYTIC GEOMETRY. An air traffi c controller uses algebra and geometry to help airplanes get from one point to another.

Analytic Geometry 300 UNIT 9 ANALYTIC GEOMETRY. An air traffi c controller uses algebra and geometry to help airplanes get from one point to another. UNIT 9 Analtic Geometr An air traffi c controller uses algebra and geometr to help airplanes get from one point to another. 00 UNIT 9 ANALYTIC GEOMETRY Copright 00, K Inc. All rights reserved. This material

More information

Steps to take to do the descriptive part of regression analysis:

Steps to take to do the descriptive part of regression analysis: STA 2023 Simple Linear Regression: Least Squares Model Steps to take to do the descriptive part of regression analysis: A. Plot the data on a scatter plot. Describe patterns: 1. Is there a strong, moderate,

More information

Bridge-Thickness Experiment. Student 2

Bridge-Thickness Experiment. Student 2 Applications 1. Below are some results from the bridge-thickness eperiment. Bridge-Thickness Eperiment Thickness (laers) Breaking Weight (pennies) 15 5 5 a. Plot the (thickness, breaking weight) data.

More information

Unit 3: Relations and Functions

Unit 3: Relations and Functions Unit 3: Relations and Functions 5-1: Binar Relations Binar Relation: - a set ordered pairs (coordinates) that include two variables (elements). (, ) = horizontal = vertical Domain: - all the -values (first

More information

Simple linear regression: linear relationship between two qunatitative variables. Linear Regression. The regression line

Simple linear regression: linear relationship between two qunatitative variables. Linear Regression. The regression line Linear Regression Simple linear regression: linear relationship etween two qunatitative variales The regression line Facts aout least-squares regression Residuals Influential oservations Cautions aout

More information

Correlation and Regression

Correlation and Regression A. The Basics of Correlation Analysis 1. SCATTER DIAGRAM A key tool in correlation analysis is the scatter diagram, which is a tool for analyzing potential relationships between two variables. One variable

More information

Algebra 1. Standard Create and Analyze Tables and Graphs. Categories Tables Graphs Context Problems Making Predictions

Algebra 1. Standard Create and Analyze Tables and Graphs. Categories Tables Graphs Context Problems Making Predictions Algebra Standard Create and Analze Tables and Graphs Categories Tables Graphs Contet Problems Making Predictions Summative Assessment Date: Wednesda, August 9 th Page Create and Analze Tables and Graphs

More information

a. In the statement "Height is a function of weight," which is the independent variable and which is the dependent variable?

a. In the statement Height is a function of weight, which is the independent variable and which is the dependent variable? 1. The weights and heights of si mathematics students are given in the table. Answer parts a through e. Weight (lb.) Height (cm) 157 19 11 155 1 11 175 17 157 15 17 17 a. In the statement "Height is a

More information

Correlation and Residuals

Correlation and Residuals Correlation and Residuals Having pets is not onl fun, but it can be good for our health, too. Studies show that there is a correlation between pet ownership and blood pressure, mood, and the abilit of

More information

Chapter 1: Linear Equations and Functions

Chapter 1: Linear Equations and Functions Chapter : Answers to Eercises Chapter : Linear Equations and Functions Eercise.. 7= 8+ 7+ 7 8 = 8+ + 7 8 = 9. + 8= 8( + ). + 8= 8+ 8 8 = 8 8 7 = 0 = 0 9 = = = () = 96 = 7. ( 7) = ( + ) 9.. = + + = + =

More information

y n 1 ( x i x )( y y i n 1 i y 2

y n 1 ( x i x )( y y i n 1 i y 2 STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered

More information

Lesson Master 6-1B. USES Objective E. Questions on SPUR Objectives. In 1 5, use the chart showing the percent of households that had a computer.

Lesson Master 6-1B. USES Objective E. Questions on SPUR Objectives. In 1 5, use the chart showing the percent of households that had a computer. Back to Lesson 6-1 6-1B USES Objective E In 1 5, use the chart showing the percent of households that had a computer. Year 1989 1993 1997 2001 Percent of Households 15.0 22.8 36.6 56.3 1. Make a line graph

More information

Chapter 4 ( 2, 2 ). Chapter 4 Opener. Section 4.1. Big Ideas Math Blue Worked-Out Solutions. Try It Yourself (p. 141) = = = = 15. The solution checks.

Chapter 4 ( 2, 2 ). Chapter 4 Opener. Section 4.1. Big Ideas Math Blue Worked-Out Solutions. Try It Yourself (p. 141) = = = = 15. The solution checks. Chapter Chapter Opener Tr It Yourself (p. ). ab = ( ) = ( ) = = = =. a b = ( ). a b = = = = + = + = + 9 =, or. a ( b a ). Point Q is on the -ais, units up from the origin. So, the -coordinate is, and the

More information