Lecture 1: Linear Regression with One Predictor Variable

- Basics
  - Meaning of regression parameters
    - $\beta_1$: the slope of the regression line. It indicates the change in the mean of the probability distribution of Y per unit increase in X.
    - $\beta_0$: the Y intercept of the regression line.
- Estimation of regression function (Section 1.6)
  - Least squares method
  - Example: SAS output

SAS code for Q1.28 p37

    options ls=75 nodate;
    data crime;
      input Y X;
      datalines;
    8487 74
    8179 82
    8362 81
    8220 81
    6246 87
    9100 66
    ...
    363 9
    8040 88
    698 83
    758 76
    ;
    proc plot;
      plot Y*X;
    proc reg;
      model Y=X;
    run;
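The least squares method listed above can be sketched in a few lines of Python. This is only an illustration of the closed-form estimates $b_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$ and $b_0 = \bar{Y} - b_1 \bar{X}$, not the course's SAS workflow. The function name `least_squares_fit` and the five data points are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected cross-product and corrected sum of squares of X.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


# Hypothetical data (not the crime data from Q1.28).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = least_squares_fit(x, y)
print(f"fitted line: Y-hat = {b0:.2f} + {b1:.2f} X")
```

For these points the sums are small enough to verify by hand: the corrected cross-product is 6 and the corrected sum of squares of X is 10, so the slope is 0.6 and the intercept is 4 - 0.6 × 3 = 2.2.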

[Plot of Y*X for Q1.28 p37 (SAS PROC PLOT output). Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: Y, from 2000 to 14000. Horizontal axis: X, from 60 to 95.]

The SAS System

Model: MODEL1
Dependent Variable: Y

                      Analysis of Variance

                       Sum of         Mean
    Source      DF    Squares       Square    F Value    Prob>F
    Model        1   93462942     93462942     16.834    0.0001
    Error       82  455273166      5552112
    C Total     83  548736108

    Root MSE    2356.29195    R-square    0.1703
    Dep Mean    7111.20238    Adj R-sq    0.1602
    C.V.          33.13493

                      Parameter Estimates

                   Parameter     Standard     T for H0:
    Variable  DF    Estimate        Error   Parameter=0   Prob > |T|
    INTERCEP   1       20518    3277.6427         6.260       0.0001
    X          1   -170.5752      41.5743        -4.103       0.0001

- Properties of least squares
- Point estimation of mean response
- Residuals
- Example: In Q1.28 p37, the regression equation is $\hat{Y} = 20518 - 170.5752X$. When X = 74, the predicted value of Y is $\hat{Y} = 20518 - 170.5752 \times 74 \approx 7895.44$, and the residual is $Y - \hat{Y} = 8487 - 7895.44 = 591.56$.

SAS output

    options ls=75 nodate;
    data crime;
      input Y X;
      datalines;
    8487 74
    8179 82
    ...
    698 83
    758 76
    ;
    proc plot;
      plot Y*X;
    proc reg;
      model Y=X;
      output out=a p=pred r=resid;
    proc print data=a;
      var Y pred resid;
    run;

The SAS System

    OBS       Y       PRED       RESID
      1    8487    7895.04      591.96
      2    8179    6530.43     1648.57
      3    8362    6701.01     1660.99
      4    8220    6701.01     1518.99
      5    6246    5677.56      568.44
      6    9100    9259.64     -159.64
      7    6561    8918.49    -2357.49
      8    5873    6701.01     -828.01
      9    7993    7895.04       97.96
     10    7932    6530.43     1401.57

- Properties of the fitted line
  1. $\sum_{i=1}^{n} e_i = 0$
  2. $\sum_{i=1}^{n} X_i e_i = 0$
  3. $\sum_{i=1}^{n} \hat{Y}_i e_i = 0$
- Estimation of the error variance
- Normal error regression model (Section 1.8)
- MLEs of parameters

Practice Problems 7, 3, 34
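The three properties of the fitted line, and the usual estimate of the error variance $MSE = SSE/(n-2)$, can be checked numerically. The sketch below is a hedged illustration on hypothetical data (not the crime data). All variable names are made up for this example.

```python
# Hypothetical data, chosen only for easy hand-checking.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

# Least squares estimates.
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Fitted values and residuals e_i = Y_i - Y-hat_i.
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Each of these sums is zero up to floating-point rounding.
sum_e = sum(resid)
sum_xe = sum(xi * ei for xi, ei in zip(x, resid))
sum_fe = sum(fi * ei for fi, ei in zip(fitted, resid))

# Error variance estimate: SSE divided by n - 2 degrees of freedom.
sse = sum(ei ** 2 for ei in resid)
mse = sse / (n - 2)

print(sum_e, sum_xe, sum_fe, mse)
```

The divisor is n - 2 rather than n because two parameters, $\beta_0$ and $\beta_1$, are estimated before the residuals are formed.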

Chapter 2: Inference in Regression and Correlation Analysis p40

2.1 Sampling distribution of $b_1$
- unbiased
- linear
- has minimum variance
- estimated variance p43
- sampling distribution of $(b_1 - \beta_1)/SE(b_1)$ p44
- CIs for $\beta_1$ p45
- Tests concerning $\beta_1$ p47

2.2 Sampling distribution of $b_0$
- mean, variance
- sampling distribution of $(b_0 - \beta_0)/SE(b_0)$
- CIs for $\beta_0$

2.3 Some considerations on making inferences concerning $\beta_1$ and $\beta_0$ p50
- Effects of departure from normality p50
- Spacing of X values p50
- Power of tests p50

2.4 Interval estimation of $E(Y_h)$
- Sampling distribution of $\hat{Y}_h$
- Sampling distribution of $(\hat{Y}_h - E(Y_h))/s(\hat{Y}_h) \sim t(n-2)$
- CIs for $E(Y_h)$ p54
- Example

2.5 Prediction of new observations p55
- Example
- Prediction of m new observations for given $X_h$

2.6 Confidence band for regression line
- Working-Hotelling $1 - \alpha$ confidence band for the regression line: $\hat{Y}_h \pm W s\{\hat{Y}_h\}$, where $W^2 = 2F(1 - \alpha; 2, n - 2)$

2.7 ANOVA approach to regression analysis p63
- Partitioning the total SS p63: $SSTO = SSR + SSE$
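For reference, the interval procedures named in this outline take the following standard forms under the normal error regression model. The formulas are written out here as a reminder and are not copied from the lecture. $MSE$ denotes the error mean square and $X_h$ the level of X of interest.

```latex
\begin{align*}
% Confidence interval for the mean response E(Y_h)
s^2\{\hat{Y}_h\} &= MSE\left[\frac{1}{n}
    + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right],
  & \hat{Y}_h &\pm t(1 - \alpha/2;\, n - 2)\, s\{\hat{Y}_h\} \\
% Prediction interval for a single new observation at X_h
s^2\{\mathrm{pred}\} &= MSE\left[1 + \frac{1}{n}
    + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right],
  & \hat{Y}_h &\pm t(1 - \alpha/2;\, n - 2)\, s\{\mathrm{pred}\} \\
% Working-Hotelling confidence band for the whole regression line
W^2 &= 2F(1 - \alpha;\, 2,\, n - 2),
  & \hat{Y}_h &\pm W\, s\{\hat{Y}_h\}
\end{align*}
```

The prediction variance exceeds the variance of the estimated mean response by exactly $MSE$, which is why a prediction interval stays wide even when the regression line itself is estimated precisely.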

- ANOVA table p67
- Expected MS p68: $E\{MSE\} = \sigma^2$ and $E\{MSR\} = \sigma^2 + \beta_1^2 \sum (X_i - \bar{X})^2$
- F test for $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$ p69
  - Example: Q1.28 above
- Equivalence of the F test and the t test for $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$

2.8 General linear test procedure

2.9 Descriptive measures of linear association between X and Y p74

$R^2 = SSR/SSTO = 1 - SSE/SSTO$

- Example: See SAS output for Q1.28 above
- Limitations of $R^2$ p75

Misunderstanding 1: A high $R^2$ indicates that useful predictions can be made. This is not necessarily correct.

Example: In the Toluca Company example $R^2 = 0.822$, but the 90% prediction interval for a new lot consisting of 100 units is wide (332.21, 506.56) and not precise enough to permit management to schedule workers effectively.

The SAS System

Model: MODEL1
Dependent Variable: Y

                      Analysis of Variance

                        Sum of          Mean
    Source      DF     Squares        Square    F Value    Prob>F
    Model        1  252377.5808  252377.5808    105.876    0.0001
    Error       23   54825.4592    2383.7156
    C Total     24  307203.0400

    Root MSE     48.82331    R-square    0.8215
    Dep Mean    312.28000    Adj R-sq    0.8138
    C.V.         15.63447

                      Parameter Estimates

                   Parameter      Standard     T for H0:
    Variable  DF    Estimate         Error   Parameter=0   Prob > |T|
    INTERCEP   1   62.365859   26.17743389         2.382       0.0259
    X          1    3.570202    0.34697216        10.290       0.0001
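The ANOVA quantities in this part of the outline can be recomputed from the sums of squares alone. The sketch below assumes the Toluca Company values SSR = 252377.5808 and SSE = 54825.4592 with n = 25, and the reported slope t statistic of 10.290. The variable names are illustrative only.

```python
# Toluca Company sums of squares (assumed values, see lead-in).
ssr = 252377.5808   # regression sum of squares, 1 df
sse = 54825.4592    # error sum of squares, n - 2 df
n = 25

ssto = ssr + sse            # partition: SSTO = SSR + SSE
msr = ssr / 1               # regression mean square
mse = sse / (n - 2)         # error mean square
f_star = msr / mse          # F statistic for H0: beta_1 = 0

# Two equivalent forms of the coefficient of determination.
r_sq = ssr / ssto
r_sq_alt = 1 - sse / ssto

# Equivalence of the F and t tests: F* equals (t*)^2.
t_star = 10.290             # reported t statistic for the slope
t_star_sq = t_star ** 2

print(f"SSTO = {ssto:.4f}, MSE = {mse:.4f}")
print(f"F* = {f_star:.3f}, (t*)^2 = {t_star_sq:.3f}, R-sq = {r_sq:.4f}")
```

The small gap between F* and (t*)^2 comes only from the t statistic being rounded to three decimals in the printed output.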

Misunderstanding 2: A high $R^2$ indicates that the estimated regression line is a good fit. This is not necessarily correct.

Misunderstanding 3: An $R^2$ near zero indicates that X and Y are not related. This is not necessarily correct.

- Coefficient of correlation

A measure of linear association between Y and X when both Y and X are random is the coefficient of correlation. This measure is the signed square root of $R^2$:

$r = \pm\sqrt{R^2}$

A plus or minus sign is attached to this measure according to whether the slope of the regression line is positive or negative.

Example: For the Toluca Company example above, $R^2 = 0.822$. Treating X as a random variable, $r = +\sqrt{0.822} = 0.907$.

2.10 Considerations in applying regression analysis p77
- read p77

- Inferences on correlation coefficients p83

The coefficient of correlation $\rho_{12}$ between two random variables $Y_1$ and $Y_2$ is

$\rho_{12} = \sigma_{12} / (\sigma_1 \sigma_2)$, where $\sigma_{12} = E\{(Y_1 - \mu_1)(Y_2 - \mu_2)\}$

- A point estimator of $\rho_{12}$: the MLE of $\rho_{12}$ is (p83)

$r_{12} = \sum (Y_{i1} - \bar{Y}_1)(Y_{i2} - \bar{Y}_2) / [\sum (Y_{i1} - \bar{Y}_1)^2 \sum (Y_{i2} - \bar{Y}_2)^2]^{1/2}$

- Testing $H_0: \rho_{12} = 0$ vs $H_a: \rho_{12} \neq 0$
- Test statistic: $t^* = r_{12}\sqrt{n-2} / \sqrt{1 - r_{12}^2} \sim t(n-2)$ under $H_0$

Example: In the Toluca Company example above, r = 0.907. If we are interested in testing $H_0: \rho \le 0$ vs $H_a: \rho > 0$, then

$t^* = 0.907\sqrt{25-2} / \sqrt{1 - 0.907^2} \approx 10.3 > t(25-2, 0.95)$

and so we reject $H_0: \rho \le 0$.
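The test statistic for the correlation coefficient is easy to recompute. The sketch below assumes r = 0.907 and n = 25 for the Toluca Company example. The helper name `corr_t_statistic` is hypothetical, and the critical value 1.714 is the tabled 0.95 quantile of the t distribution with 23 degrees of freedom, hard-coded here rather than computed.

```python
import math


def corr_t_statistic(r, n):
    """t* = r * sqrt(n - 2) / sqrt(1 - r^2), for testing rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.907   # sample correlation (assumed, see lead-in)
n = 25      # number of observations

t_star = corr_t_statistic(r, n)

# One-sided test of H0: rho <= 0 vs Ha: rho > 0 at the 0.05 level.
t_crit = 1.714              # tabled t(0.95; 23), hard-coded assumption
reject_h0 = t_star > t_crit

print(f"t* = {t_star:.2f}, reject H0: {reject_h0}")
```

Because $r^2 = R^2$ in simple linear regression, this statistic agrees, up to rounding of r, with the t statistic for the slope in the regression output.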