IT 403 Practice Problems (2-2) Answers

#1. Which of the following is correct with respect to the correlation coefficient (r) and the slope of the least-squares regression line? (Choose one)
  a. They will always have the same sign.
  b. They will have opposite signs.
  c. Nothing, because they are two different measures that are not related to one another.
Answer: a (same sign)

#2. What does r² measure?
Answer: r², the coefficient of determination, measures the fraction (or percent) of the variation in the values of y that is explained by the least-squares regression of y on x.

#3. [Exercise 2.78, p. 120 (slightly re-worded)] Refer to Exercise 2.75, where you examined the relationship between the number of undergraduate college students and the populations for the 50 states. Figure 2.21 gives the output from a software package for the regression. Use this output to answer the following questions:
  i. What is the equation of the least-squares regression line?
     ŷ = 15044.917 + 0.053x
  ii. What is the value of r²?
     0.968
  iii. Interpret the value of r².
     96.8% of the variation in the number of undergraduates is accounted for by the population size.
  iv. Does the software output tell you that the relationship is linear and not, for example, curved? Explain your answer.
     No. The output does not report the form of the relationship; the software simply assumes a linear relationship in the calculations shown. A scatterplot or residual plot is needed to check whether a line is appropriate.
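The answers to #1 and #2 can be checked numerically. The following is a minimal Python sketch, not part of the original answer key: the five (x, y) pairs are hypothetical illustration data, and the helper names `fit_line` and `explained_fraction` are made up for this example. It shows that the slope and r carry the same sign and that r² equals the share of the variation in y explained by the line.

```python
from math import sqrt

def fit_line(x, y):
    """Return (intercept, slope, r) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # slope and r share the sign of sxy
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

def explained_fraction(x, y):
    """1 - SSE/SST: share of the variation in y explained by the line."""
    intercept, slope, _ = fit_line(x, y)
    y_bar = sum(y) / len(y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical data, chosen only to keep the arithmetic small.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope, r = fit_line(x, y)
print(slope > 0, r > 0)                    # both positive
print(r ** 2, explained_fraction(x, y))    # the two values agree

# Flipping the sign of y flips the sign of both the slope and r.
_, neg_slope, neg_r = fit_line(x, [-yi for yi in y])
print(neg_slope < 0, neg_r < 0)
```

For real work a statistics package would normally be used; the hand-rolled sums here are only meant to make the definitions visible.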

#4. [Exercise 2.80, p. 121] The following 20 observations on Y and X were generated by a computer program.
  i. Make a scatterplot and describe the relationship between Y and X.
     Note: to obtain a scatterplot in SPSS, choose Graphs >> Chart Builder, select Scatter Plot from the Gallery list, and drag the respective variables to the X and Y axes.
     There seems to be a weak positive linear relationship between y and x.
  ii. Find the equation of the least-squares regression line and add the line to your plot.

     Coefficients(a)
                      Unstandardized Coefficients   Standardized Coefficients
     Model            B          Std. Error         Beta                        t        Sig.
     1  (Constant)    17.380     4.742                                          3.665    .002
        x               .623      .239              .523                        2.604    .018
     a. Dependent Variable: y

     The least-squares regression equation is ŷ = 17.380 + 0.623x.
     You can obtain the regression equation in SPSS with Analyze >> Regression >> Linear. The equation is in the output titled Coefficients: look at column B under Unstandardized Coefficients. The intercept (b0) is the value for (Constant) and the slope is the value for x.
  iii. What percent of the variability in Y is explained by X?

     Model Summary
     Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
     1       .523(a)    .274       .233                1.93911
     a. Predictors: (Constant), x

     In another output, titled Model Summary, you see R Square, which is 0.274. So 27.4% of the variability in Y is explained by X.
  iv. Summarize your analysis of these data in a short paragraph.
     [Textbook solution] The x variable only accounts for 27.37% of the variation in y, so the relationship is fairly weak.

#5. A chemist was conducting an experiment to find how many ml of a particular substance dissolves in different temperatures of water. A correlation of 0.87 was computed. Which interpretation is TRUE? (Choose one)
  a. 87% of the variation in the amount of dissolved substance is explained by temperature.
  b. Correlation cannot be computed because temperature is not a continuous variable.
  c. 76% of the variation in the amount of dissolved substance is explained by temperature.
Answer: c (76%), because r² = 0.87² ≈ 0.76.

#6. Do heavier cars use more gasoline? To answer this question, a researcher randomly selected 15 cars. He collected data about the weight (in hundreds of pounds) and the mileage (mpg) for each car. From a scatter plot made with the data, a linear model seems appropriate. The percentage of variation in mileage that is accounted for by the linear relationship between mileage and weight is approximately 44%. What is the value of the correlation coefficient between the weight and the mileage of a car?
Answer: √0.44 ≈ 0.663, so the magnitude of r is 0.663. The value of r² alone does not give the sign of r; the sign comes from the direction of the association in the scatter plot. Heavier cars generally get lower mileage, so the association is expected to be negative, giving r ≈ −0.663.

#7. Below is a plot of the Olympic gold-medal-winning performance in the high jump (in inches) for the years 1900 to 1996. The equation of the least-squares regression line of Winning Height (in inches) on Year is

     Winning Height = −364.90 + 0.23 × Year

In another millennium (the year 3000), if the Olympics continue to be held, we can expect the Winning Height to be about (Choose one):

  a. 325 inches.
  b. 690 inches.
  c. none of the above.
Answer: c (none of the above). The year 3000 lies far outside the 1900 to 1996 range of the data, so any prediction from the line is an extrapolation and cannot be trusted.

#8. The British government conducts regular surveys of household spending. The average weekly household spending on tobacco products and spending on alcoholic beverages for each of 11 regions in Great Britain were recorded. A scatter plot of spending on tobacco versus spending on alcohol is given below:
Determine whether each of the following statements is true or false.
  i. The observation in the lower-right corner of the plot is influential. -- TRUE
  ii. There is clear evidence of a negative association between spending on alcohol and spending on tobacco. -- FALSE
  iii. The equation of the least-squares regression line for this plot would be approximately y = 10 − 2x. -- FALSE
  iv. If we measured the spending in dollars instead of pounds, the correlation coefficient would decrease because a dollar is worth less than a pound. -- FALSE

#9. A(n) ____ is an observation that is substantially different from the other observations. (Choose one)
  a. outlier
  b. lurking variable
  c. confounding variable
  d. None of the above.

Answer: a (outlier)

#10. It is known that not exercising may lead to poor health. However, it is possible that people who are already in poor health do not have the ability or energy to exercise. This example is one of ____. (Choose one)
  a. causation
  b. common response
  c. confounding
  d. None of the above.
Answer: c (confounding)

#11. [Exercise 2.96, p. 134 (slightly re-worded)] Barium-137m is a radioactive form of the element barium that decays very rapidly. It is easy and safe to use for lab experiments in schools and colleges. In a typical experiment, the radioactivity of a sample of barium-137m is measured for one minute. It is then measured for three additional one-minute periods, separated by two minutes. So data are recorded at one, three, five, and seven minutes after the start of the first counting period. The measurement units are counts. Here are the data for one of these experiments.

     Time   Count   LogCount
     1      578     6.35957
     3      317     5.75890
     5      203     5.31321
     7      118     4.77068

  i. Using the least-squares regression equation count = 602.8 − (74.7 × time) and the observed data, find the residuals for the counts.
     In SPSS, you can save the residual (observed value of the response variable minus its predicted value) for every observation. To do so, choose Analyze >> Regression >> Linear, click the Save button, and in the window that opens check Unstandardized under Residuals (as shown in the next figure).

  ii. Plot the residuals versus time.
     The graph on the right was obtained using SPSS as a scatter plot between Time and the residuals.
  iii. Write a short paragraph assessing the fit of the least-squares regression line to these data based on your interpretation of the residual plot.
     [Textbook solution] There is a clear curve in the residual plot; this is not a good model for these data.
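The residuals asked for in #11 can also be computed by hand as observed count minus predicted count. The sketch below is a hedged Python illustration using the four data points and the regression equation given in the problem; the names `predicted_count` and `residuals` are chosen for this example only and do not come from SPSS.

```python
# Data from the barium-137m experiment in #11.
times = [1, 3, 5, 7]
counts = [578, 317, 203, 118]

def predicted_count(t):
    """Least-squares line given in the problem: count = 602.8 - 74.7 * time."""
    return 602.8 - 74.7 * t

# Residual = observed value - predicted value, rounded to one decimal place.
residuals = [round(c - predicted_count(t), 1) for t, c in zip(times, counts)]

for t, res in zip(times, residuals):
    print(f"time {t}: residual {res:+.1f}")
# time 1: residual +49.9
# time 3: residual -61.7
# time 5: residual -26.3
# time 7: residual +38.1
```

The residuals are positive at both ends and negative in the middle, which is the curved pattern described in the textbook solution for part iii.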

#12. [Exercise 2.101, p. 134] What's wrong? Each of the following statements contains an error. Describe each error and explain why the statement is wrong.
  i. "An influential observation will always have a large residual."
     If the line is pulled toward the influential point, the observation will not necessarily have a large residual.
  ii. "High correlation is never present when there is causation."
     "Never" is wrong. A causal link between two variables usually shows up as a strong association, so high correlation is typically present when there is causation.
  iii. "If we have data at values of x equal to 1, 2, 3, 4, and 5, and we try to predict the value of y for x = 2.5 using a least-squares regression equation, we are doing an extrapolation."
     Extrapolation is using a regression to predict for x-values outside the range of the data (here, predicting at x = 20, for example). The value x = 2.5 lies inside the range 1 to 5, so this is not an extrapolation.

#13. Correlations caused by lurking variables are called ____. (Choose one)
  a. nonsense correlations
  b. association correlations
  c. reverse correlations
  d. None of the above.
Answer: a (nonsense correlations)
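The extrapolation rule used in #12 (iii), and earlier in #7, amounts to a simple range check. Here is a minimal Python sketch under that reading; the function name `is_extrapolation` is hypothetical and introduced only for illustration.

```python
def is_extrapolation(x_new, x_observed):
    """True when x_new falls outside the range of the observed x-values."""
    return x_new < min(x_observed) or x_new > max(x_observed)

# 12 (iii): data at x = 1, 2, 3, 4, 5.
x_observed = [1, 2, 3, 4, 5]
print(is_extrapolation(2.5, x_observed))  # False: 2.5 is inside 1..5
print(is_extrapolation(20, x_observed))   # True: 20 is outside 1..5

# 7: the high-jump data cover the years 1900 to 1996.
print(is_extrapolation(3000, [1900, 1996]))  # True
```

A check like this only flags predictions outside the observed range; it says nothing about whether the linear model fits well inside that range.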