HOMEWORK ANALYSIS #2 - STOPPING DISTANCE

Total Points Possible: 35

1. In your own words, summarize the overarching problem and any specific questions that need to be answered using the stopping distance data. Discuss how statistical modeling will be able to answer the posed questions.

(a) (1 pt) Discuss the potential value of determining stopping distance from the speed of a car in determining speed limits. Safety, etc. could be mentioned.
(b) (2 pts) The main interest in this problem is predicting the stopping distance of cars based on their speeds. -0.5 pt if there is a decent explanation, but the word "prediction" is missing.
(c) (2 pts) Statistical modeling can help predictions by providing a quantifiable relationship where speeds can be plugged in and stopping distance predicted.

2. Use the data to assess if a simple linear regression model (without doing any transformations) is suitable to analyze the stopping distance data. Justify your answer using any necessary graphics and relevant summary statistics. Provide discussion on why an SLR model on the raw data (not transformed) is or is not appropriate.

(a) (2.5 pts) Draw a plot (e.g. a scatterplot, a residuals vs. fitted values plot, or both). -1 pt if there are incorrect label(s), or something else is wrong with the plot(s).
(b) (2.5 pts) Discuss, in writing, why fitting a linear model is a bad idea (they only need to mention at least one of the following to receive full points):
    * The scatterplot indicates the data has a curved relationship, violating the linearity assumption.
    * The residuals vs. fitted values plot shows there is more variation in stopping distances at higher speeds, violating the equal variance assumption.

[Figure: histogram of residuals (density); residuals vs. fitted values]

3. Write out (in mathematical form) a justifiable (perhaps after a transformation) SLR model that would help answer the questions posed in the problem. Provide an interpretation of each mathematical term (variable or parameter) included in your model. Using the mathematical form, discuss how your model, after fitting it to the data, will be able to answer the questions in this problem.

The model needs a transformation. Several transformations are possible.

(a) (2 pts) Write out their model in equation form. In each model below, $y_i$ denotes stopping distance, $x_i$ denotes speed, and $\epsilon_i \sim N(0, \sigma^2)$.

The following are preferable transformations:
    Model 1: $\log(y_i) = \beta_0 + \beta_1 \log(x_i) + \epsilon_i$
    Model 2: $\sqrt{y_i} = \beta_0 + \beta_1 \sqrt{x_i} + \epsilon_i$
    Model 3: $\sqrt{y_i} = \beta_0 + \beta_1 x_i + \epsilon_i$

The following are poor transformations (-1.5 pts if one of these is used):
    Model 4: $y_i = \beta_0 + \beta_1 \log(x_i) + \epsilon_i$
    Model 5: $\log(y_i) = \beta_0 + \beta_1 x_i + \epsilon_i$
    Model 6: $y_i = \beta_0 + \beta_1 \sqrt{x_i} + \epsilon_i$

The following is the untransformed model (-2 pts if used):
    Model 7: $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$

Subtract 0.5 pt for any missing parts, including $\epsilon_i$.

(b) (3 pts) Define $y_i$, $x_i$, and $\epsilon_i$ and interpret $\beta_0$ and $\beta_1$ correctly (depends on their transformation, 0.5 pt each). Make sure they keep interpretations in the units of the transformed variables, not the originals. If they interpret the variables in terms of the original untransformed data, but the interpretations are otherwise correct, subtract 1.5 pts.

4. List, then discuss and justify your model assumptions using appropriate graphics or summary statistics.

(a) (1 pt) List the assumptions of linearity, independence, normality, and homoskedasticity.
(b) (4 pts) Discuss and justify the assumptions of linearity, independence, normality, and homoskedasticity (1 pt for each assumption).
    * For linearity, a scatterplot of the transformed data should be used. The correlation could also be mentioned.
    * For independence, a reasonable explanation is all that is necessary. A residuals vs. fitted values plot could also be utilized, but is not necessary.
    * For normality, a histogram of standardized residuals should be used. The KS or JB test could also be used. A Q-Q plot is another option.
    * For equal variance, the BP test could be used, or a discussion regarding one of the plots above could be used.
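As a worked illustration of the interpretations asked for in item 3(b), the following sketch writes out Model 1 on both the transformed and the original scale. It assumes $y_i$ is stopping distance and $x_i$ is speed, which the rubric implies but does not spell out.

```latex
\begin{align*}
\log(y_i) &= \beta_0 + \beta_1 \log(x_i) + \epsilon_i,
  \qquad \epsilon_i \sim N(0, \sigma^2) \\
E[\log(y_i) \mid x_i] &= \beta_0 + \beta_1 \log(x_i) \\
y_i &= e^{\beta_0} \, x_i^{\beta_1} \, e^{\epsilon_i}
\end{align*}
```

On the transformed scale, $\beta_0$ is the expected value of $\log(y_i)$ when $\log(x_i) = 0$, and $\beta_1$ is the expected change in $\log(y_i)$ for a one-unit increase in $\log(x_i)$. The last line shows that the same model is a power curve on the original scale, which is why a straight-line fit to the transformed data can follow a curved scatterplot of the raw data.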

[Figure: histogram of residuals (density); residuals vs. fitted values; transformed scatterplot of log(y) vs. log(x)]

5. Assess and interpret the fit and predictive accuracy of your model on the level of your target audience.

(a) (2 pts) Report $R^2$ (1 pt) and interpret it in context (1 pt): the percentage of the variation in (potentially transformed) y that is explained by (potentially transformed) x.

    Model     R^2
    Model 1   0.902
    Model 2   0.906
    Model 3   0.925
    Model 4   0.723
    Model 5   0.868
    Model 6   0.816
    Model 7   0.878

(b) (3 pts) Perform cross validation to assess predictive accuracy and interpret the results. Students should report bias and RMSPE and interpret these. Note that because of random variation in the simulation, bias and RMSPE values will differ. Give full points for reasonable answers with reasonable interpretations.
    -1 pt for insufficient or unclear interpretations
    -2 pts if cross validation was attempted, but all answers are clearly wrong
    -3 pts if cross validation was not attempted

6. Fit your model in #3 to the stopping distance data and summarize the results by displaying the fitted model in equation form (do NOT just provide a screen shot of the R or SAS output). Interpret each of the fitted parameters in the context of the problem. Provide a plot of the data with a fitted regression line on the original scale of the data.

(a) (2 pts) Report coefficients in equation form.

    (1) $\log(\hat{y}) = -1.102 + 1.568 \log(x)$
    (2) $\sqrt{\hat{y}} = -3.117 + 2.107 \sqrt{x}$
    (3) $\sqrt{\hat{y}} = 0.932 + 0.252 x$
    (4) $\hat{y} = -91.022 + 46.889 \log(x)$
    (5) $\log(\hat{y}) = 1.487 + 0.094 x$
    (6) $\hat{y} = -67.681 + 25.540 \sqrt{x}$
    (7) $\hat{y} = -20.131 + 3.142 x$

(b) (2 pts) Interpret coefficients in the context of the problem. E.g. as log(x) goes up by 1, log(y) goes up by 1.568 on average.
(c) (1 pt) Provide a plot on the original scale of the data like the one below (including a fitted regression line).
    -0.5 pt if a plot was attempted, but something is wrong with it (line seems off, variables are switched, etc.)
    -1 pt if there isn't a plot

[Figure: data on the original scale with the fitted regression line]
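The cross validation in item 5(b) could be sketched roughly as follows. This is a minimal illustration, not the course's own procedure: it assumes a repeated random train/test split, fits the log-log model (Model 1), and reports bias and RMSPE on the original scale. The function names, the number of repetitions, the test-set size, and the synthetic data are all hypothetical; the synthetic data stands in for the stopping distance data, which is not reproduced here.

```python
import math
import random


def fit_slr(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def cross_validate(speeds, dists, n_reps=500, n_test=10, seed=1):
    """Average bias and RMSPE over repeated random train/test splits.

    Fits log(dist) ~ log(speed) on the training rows, back-transforms the
    predictions with exp(), and compares them to the held-out distances.
    """
    rng = random.Random(seed)
    idx = list(range(len(speeds)))
    biases, rmspes = [], []
    for _ in range(n_reps):
        test = set(rng.sample(idx, n_test))
        train = [i for i in idx if i not in test]
        b0, b1 = fit_slr(
            [math.log(speeds[i]) for i in train],
            [math.log(dists[i]) for i in train],
        )
        # Prediction error on the original scale: predicted minus observed.
        errors = [math.exp(b0 + b1 * math.log(speeds[i])) - dists[i] for i in test]
        biases.append(sum(errors) / n_test)
        rmspes.append(math.sqrt(sum(e * e for e in errors) / n_test))
    return sum(biases) / n_reps, sum(rmspes) / n_reps


def make_synthetic_data(n=60, seed=0):
    """Hypothetical data only: arbitrary power-law curve with lognormal noise."""
    rng = random.Random(seed)
    speeds = [rng.uniform(5.0, 40.0) for _ in range(n)]
    dists = [math.exp(-1.1 + 1.57 * math.log(s) + rng.gauss(0.0, 0.3)) for s in speeds]
    return speeds, dists


if __name__ == "__main__":
    bias, rmspe = cross_validate(*make_synthetic_data())
    print(f"bias = {bias:.2f}, RMSPE = {rmspe:.2f}")
```

A bias near zero suggests the predictions are not systematically too high or too low, and the RMSPE gives a typical size of the prediction error in the units of stopping distance. Changing the seed changes both values, which is the random variation the rubric mentions.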

7. The local law enforcement is considering implementing a speed limit of 35 MPH. Use your model to obtain a prediction of the distance required by a vehicle to stop when traveling at 35 MPH. How much of a reduction in stopping distance would be achieved by making it a 30 MPH speed limit instead? Given that the road is a rural road with many homes, provide an argument for or against the use of 35 MPH.

(a) (3 pts) Predict at 35 MPH and then at 30 MPH (1.5 pts each).

    Model     30 MPH   35 MPH
    Model 1   68.79    87.60
    Model 2   70.95    87.38
    Model 3   72.36    95.43
    Model 4   68.46    75.68
    Model 5   73.15    116.76
    Model 6   72.21    83.41
    Model 7   74.12    89.83

(b) (2 pts) Provide some form of argument that the 30 MPH speed limit is preferred. Any reasonable argument gets full credit.
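The predictions in item 7(a) can be approximately reproduced from the fitted equations in item 6(a) by back-transforming each fitted value. The sketch below is one way to do that check. It rests on two assumptions that the rubric text does not state outright: which transformation each model applies to distance and speed, and that the intercepts of Models 1, 2, 4, 6, and 7 are negative. Both were inferred because they are the choices that match the table above. The names `MODELS`, `predict_distance`, and `reduction` are hypothetical.

```python
import math

# (response transform, predictor transform, intercept, slope) per model.
# Assumption: transforms and intercept signs are inferred from the
# prediction table in item 7(a); coefficients are the rounded values
# from item 6(a).
MODELS = {
    1: ("log", "log", -1.102, 1.568),
    2: ("sqrt", "sqrt", -3.117, 2.107),
    3: ("sqrt", "identity", 0.932, 0.252),
    4: ("identity", "log", -91.022, 46.889),
    5: ("log", "identity", 1.487, 0.094),
    6: ("identity", "sqrt", -67.681, 25.540),
    7: ("identity", "identity", -20.131, 3.142),
}

# Transform applied to speed before it enters the linear predictor.
FORWARD = {"log": math.log, "sqrt": math.sqrt, "identity": lambda v: v}
# Inverse transform that maps the fitted value back to distance.
INVERSE = {"log": math.exp, "sqrt": lambda v: v ** 2, "identity": lambda v: v}


def predict_distance(model, speed):
    """Predicted stopping distance on the original scale."""
    y_tf, x_tf, b0, b1 = MODELS[model]
    fitted = b0 + b1 * FORWARD[x_tf](speed)
    return INVERSE[y_tf](fitted)


def reduction(model, high=35.0, low=30.0):
    """Drop in predicted stopping distance when the limit goes from high to low."""
    return predict_distance(model, high) - predict_distance(model, low)


if __name__ == "__main__":
    for m in MODELS:
        print(
            f"Model {m}: 30 MPH = {predict_distance(m, 30):.2f}, "
            f"35 MPH = {predict_distance(m, 35):.2f}, "
            f"reduction = {reduction(m):.2f}"
        )
```

Because the coefficients are rounded to three decimals, the output matches the table only approximately. Models 3 and 5 drift the most, presumably because a small rounding error in the slope is multiplied by the speed and then squared or exponentiated.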