Statistics MINITAB - Lab 2


Statistics 20080 MINITAB - Lab 2

1. Simple Linear Regression

In simple linear regression we attempt to model a linear relationship between two variables with a straight line and make statistical inferences concerning that linear model. Using the fire damage dataset from last week, we assume here that the variable on the x-axis (the distance from the fire station) will predict the amount of fire damage caused to the house. In this case, therefore, distance from the fire station is the predictor variable and the damage to the property is the response variable.

2. Fitting the Line

Construct a scatter plot of the data to determine the nature of the relationship between the two variables. Then calculate the correlation coefficient, which describes the relationship numerically. The next step is to calculate the equation of the least squares regression line, which is the data's line of best fit. This step is known as fitting the line. The line is constructed to help the researcher see any trends and make predictions.

The line of best fit is needed because we want to predict the values of y from the values of x. In other words, we want to predict the damage (in $) using the distance from the fire station.

When fitting a straight-line model we fit what is called the least squares line. This is the straight line for which the sum of the squared vertical distances between the points and the line is as small as possible. An equation for a straight-line model has two components, the intercept and the slope. Therefore the equation of the least squares regression line takes the form

Response = intercept + slope * (predictor variable) + ε (the error or residual term)

Or more generally:

y = a + bx + ε, with fitted value ŷ = a + bx

where
a is the intercept
b is the slope of the line
ε is the vertical distance between the fitted line and the data point (i.e. the residual)
x is the chosen value of the predictor variable

Summary from lecture notes

The formulae for the estimates of the slope and the intercept are:

Slope: $$ b = \frac{SS_{xy}}{SS_{xx}} $$

Intercept: $$ a = \bar{y} - b\bar{x} $$

where

$$ SS_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n} $$

$$ SS_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n} $$

and n = sample size.

3. Fitting a Regression Model in MINITAB

Using the drop-down menus in Minitab, go to Stat - Regression - Fitted Line Plot. In the dialog box:

1. Select the response variable.
2. Select the predictor variable.
3. Ensure that the linear model is selected.

This command will give you a scatter plot of the response variable versus the predictor variable, with the least squares line shown in blue on the plot. The least squares regression equation will be displayed over the plot. If you look at the session window you will also see the ANOVA table for this model and the associated p-value, similar to the table below. We will cover what this ANOVA table means in the next class.
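The summary formulae for the slope and intercept can also be checked numerically outside Minitab. The following is a minimal Python sketch and is not part of the original lab: the function name `least_squares_line` and the five data points are hypothetical, chosen only for illustration (they are not the fire damage data).

```python
def least_squares_line(x, y):
    """Return (a, b): intercept and slope of the least squares line y-hat = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    sum_x, sum_y = sum(x), sum(y)
    # Computational forms of the sums of squares from the summary formulae
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    if ss_xx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * sum_x / n      # intercept: y-bar minus b times x-bar
    return a, b


# Hypothetical illustrative data (NOT the fire damage dataset)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares_line(x, y)
print(f"y-hat = {a:.2f} + {b:.2f} x")
```

For these made-up points, SS_xy = 6 and SS_xx = 10, so the slope is 0.6 and the intercept is 4 - 0.6 * 3 = 2.2. Running the same function on a column pair exported from Minitab should reproduce the equation shown on the fitted line plot, up to rounding.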

Regression Analysis: Damage - $ versus Distance

The regression equation is
Damage - $ = 10.28 + 4.919 Distance

S = 2.31635   R-Sq = 92.3%   R-Sq(adj) = 91.8%

Analysis of Variance

Source       DF       SS       MS       F      P
Regression    1  841.766  841.766  156.89  0.000
Error        13   69.751    5.365
Total        14  911.517

What is the least squares regression equation?

What is the slope?

What is the intercept?

What type of relationship is there between distance and damage?

Now that we know what the least squares regression equation is, we can use it to make predictions for the dependent variable. If a building which was on fire was 10 miles from the nearest fire station, how much damage would be caused to it in the event of a fire?

Hint: Substitute 10 for x in the regression line equation.

(a) For each of the datasets used in last week's lab, find the least squares regression equation, the slope and the intercept.

1.
2.
3.
4.
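The substitution described in the hint can be written as a small helper. This is a minimal Python sketch, not part of the original lab: the coefficients are copied from the Minitab output above, while the function name `predict_damage` and the example distance of 3 are assumptions made purely for illustration.

```python
# Coefficients read from the Minitab output: Damage - $ = 10.28 + 4.919 Distance
INTERCEPT = 10.28
SLOPE = 4.919


def predict_damage(distance):
    """Fitted damage (in the units of the 'Damage - $' column) at a given distance."""
    return INTERCEPT + SLOPE * distance


# Hypothetical example distance, chosen only to illustrate the substitution
example_distance = 3.0
print(round(predict_damage(example_distance), 3))
```

For a distance of 3 the fitted value is 10.28 + 4.919 * 3, roughly 25.04. The same call with any other distance carries out the substitution asked for in the questions above.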

Answer the following using your answers from (a):

If the three most recent volunteers in a blood donation clinic had blood pressures of 86, 91 and 101 respectively, what would your estimate of their platelet-calcium concentrations be?

Five people were randomly selected and their heights were measured to be 145, 150, 161, 165 and 177 cm. What are the estimated weights of these people?

If you look more closely at the figures you have just calculated and compare them to the actual values from the original dataset, you will notice that they are not exactly the same. This is because the calculated figures are fitted values read from the regression line (indicated in blue on the scatter plot). The discrepancies between the fitted and the observed values are the residuals. The coefficient of correlation is one way of quantifying how large this discrepancy is.

6. The Coefficient of Determination - R^2

How much of the variation in y is explained by the linear relationship between x and y? The answer to this is given by the Coefficient of Determination, or R^2. The Coefficient of Determination is the ratio of the variation 'explained' by the linear relationship between the predictor and response variables to the total variation in the data.

Coefficient of Determination: R^2 = SS Regression / SS Total

What is R^2 for the regression model fitted to the fire damage dataset? What does this figure mean?

Note that in the case of a simple linear regression model the coefficient of determination is the correlation coefficient squared. Calculate the square root of R^2 and compare it to the correlation coefficient.
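The ratio R^2 = SS Regression / SS Total can be reproduced from the sums of squares in the ANOVA table. This is a minimal Python sketch, not part of the original lab: the numbers are those printed in the Minitab output for the fire damage model, and the variable names are hypothetical. The square root of R^2 only gives the magnitude of the correlation coefficient, so the sketch assumes its sign is taken from the sign of the fitted slope.

```python
import math

# Sums of squares from the Analysis of Variance table in the Minitab output
ss_regression = 841.766
ss_error = 69.751
ss_total = 911.517

# The regression and error sums of squares should add up to the total
assert abs((ss_regression + ss_error) - ss_total) < 1e-6

# Coefficient of determination: share of the total variation explained by the line
r_squared = ss_regression / ss_total

# In simple linear regression |r| = sqrt(R^2); the sign of r matches the slope
slope = 4.919
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R-squared = {r_squared:.3f}")
print(f"r = {r:.3f}")
```

The R-squared value agrees with the R-Sq = 92.3% line of the output, and the r value can be compared with the correlation coefficient that Minitab reports for the same two columns.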

Calculate and interpret the coefficient of determination for each of the datasets from last week's lab.

1.
2.
3.
4.

Assignment: Due in 2 weeks' time.

From the Minitab class page, download the file named TV. This contains the data for 15 students on their final year mark and the number of hours they spend watching TV.

1. Construct a simple linear regression line for this data and show the graph.
2. What is the correlation coefficient for this data?
3. Is there a negative or a positive correlation between the number of hours spent watching TV and the end of year grade?
4. What is the value of the intercept of the regression line with the y-axis?
5. What is the slope of this regression line?

Assignments should be handed in at the beginning of class two weeks from today. Late assignments will not be accepted.

REVISION SUMMARY

After this lab you should be able to:
- Calculate the correlation coefficient by hand and in Minitab
- Fit a simple linear regression line to data using Minitab
- Make predictions from the least squares regression line