STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

1.2 Let Y be the dollar cost per year and X the number of visits per year. The mathematical relation between X and Y is $Y = 300 + 2X$. This is a functional relationship, not a statistical one: if the number of yearly spa visits for a member is known, the exact dollar amount of that member's annual dues can be calculated.

1.6 (a) See Fig. 1.6 in the text. Your picture should look similar. Plot $E(Y) = 200 + 5.0X$. Evaluate $E(Y)$ at X = 10, 20 and 40, and plot these points:

    X = 10:  E(Y) = 200 + 5.0(10) = 250
    X = 20:  E(Y) = 200 + 5.0(20) = 300
    X = 40:  E(Y) = 200 + 5.0(40) = 400

Sketch a normal curve around each of these mean values to represent the distribution of Y at each of the given X values. Note that the variance of each probability distribution is the same ($\sigma^2 = 16$).

(b) $\beta_0$: the value of $E(Y)$ at X = 0 is $\beta_0 = 200$. $\beta_1$: for each unit increase in X, $E(Y)$ increases by $\beta_1 = 5$ units.

1.7 (a) No, since the distribution of Y is unknown.

(b) Yes. Now Y has a normal distribution with $E(Y) = 100 + 20(5) = 200$ and $\mathrm{Var}(Y) = 25$, i.e., $Y \sim N(200, 25)$. Hence

$$P[195 \le Y \le 205] = P\left[\frac{195 - 200}{5} \le Z \le \frac{205 - 200}{5}\right] = P[-1 \le Z \le 1] = 0.6826$$

1.11 Over the specified range for X, from 40 to 100, there is an increase in production output after an employee takes the training program. This is because the y-intercept $b_0$ is equal to 20 (the examples below correspond to the regression function $E(Y) = 20 + 0.95X$). For example, if the production output was 40 before the training, it will be 58 after the training. Also, if the production output was 100 before the training, it will be 115 after. However, looking outside the range of X, if the production output was 1000 before the training, it would be only 970 after the training. There is an increase only within the range of X.

1.13 (a) The data are observational; there was no controlled experiment.

(b) The conclusion is not valid. One cannot make inferences about a causal relationship based on observational data. There could be confounding variables that are related to both the increased employee productivity and the increased class preparation time.

1.13 (c) Example 1: Talented employees don't need to spend much time on class preparation but still have higher productivity levels than others. Example 2: Reading technical papers or searching the web may decrease one's class preparation time. However, one may still have higher productivity levels than others.

(d) An experimenter might take a representative sample of good employees who don't spend much time preparing for class. The participants would be randomly assigned to one of two groups. Group 1 would be asked to spend several hours on preparation (say 4 hours per day) and Group 2 would be asked not to exceed 4 hours of preparing per day. The amount of time spent on preparation and the productivity level would be recorded for each individual.

1.21 (a) $\hat{Y} = 10.20 + 4.00X$

[Figure: fitted line plot "Airfreight breakage", Y = 10.2 + 4X, R-Sq = 90.1%; broken ampules (amp) versus transfers (trans)]

A linear regression function appears to fit the data well.

(b) When X = 1, $\hat{Y} = 10.2 + 4(1) = 14.2$.

(c) The increase in the number of transfers (X) is 1. So the increase in the expected number of broken ampules, $E(Y)$, is estimated by $b_1 = 4$.

(d) Calculate $\bar{X} = 1$, $\bar{Y} = 14.2$. As we have seen in part (b), $\hat{Y} = 10.2 + 4(1) = 14.2$ at X = 1. The fitted regression line therefore does pass through the point $(\bar{X}, \bar{Y})$.
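The least squares quantities quoted in Problem 1.21 can be reproduced with a few lines of Python. This is only an illustrative sketch: the ten (transfers, broken ampules) pairs are an assumption, since the solution does not list the data, and were chosen because they reproduce the fitted line and the means reported above. The function name `fit_simple_ols` is hypothetical.

```python
# Minimal sketch of simple linear regression by least squares (pure Python).
# ASSUMPTION: these observations are not listed in the solution; they are
# assumed here for illustration of the airfreight breakage problem.
transfers = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]        # X: number of transfers
broken = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]   # Y: broken ampules


def fit_simple_ols(x, y):
    """Return (b0, b1, x_bar, y_bar) for the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of cross products and corrected sum of squares of X.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1, x_bar, y_bar


b0, b1, x_bar, y_bar = fit_simple_ols(transfers, broken)
y_hat_at_1 = b0 + b1 * 1          # part (b): point estimate at X = 1
y_hat_at_mean = b0 + b1 * x_bar   # part (d): fitted value at X-bar

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print(f"Y-hat at X = 1: {y_hat_at_1:.2f}")
print(f"X-bar = {x_bar:.2f}, Y-bar = {y_bar:.2f}, Y-hat at X-bar = {y_hat_at_mean:.2f}")
```

Because $b_0 = \bar{Y} - b_1\bar{X}$ by construction, the fitted value at $\bar{X}$ always equals $\bar{Y}$, which is the property checked in part (d).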

1.25 MINITAB regression for airfreight breakage data:

    The regression equation is
    y broke = 10.2 + 4.00 transfers

    Predictor      Coef    StDev       T      P
    Constant    10.2000   0.6633   15.38  0.000
    transf       4.0000   0.4690    8.53  0.000

    S = 1.483   R-Sq = 90.1%   R-Sq(adj) = 88.9%

    Analysis of Variance
    Source            DF       SS       MS      F      P
    Regression         1   160.00   160.00  72.73  0.000
    Residual Error     8    17.60     2.20
    Total              9   177.60

    Obs  transf  y broke     Fit  StDev Fit  Residual  St Resid
      1    1.00   16.000  14.200      0.469     1.800      1.28

(a) $\hat{Y}_1 = 10.2 + 4(1) = 14.2$ and $e_1 = Y_1 - \hat{Y}_1 = 16 - 14.2 = 1.8$. The residual $e_1$ is an estimate of $\varepsilon_1$, the vertical deviation of $Y_1$ from the unknown true regression line.

(b) $\sum e_i^2 = SSE = 17.6$ and $MSE = 2.20$. MSE estimates $\sigma^2$.

1.29 Assuming X = 0 is within the scope of the model, the implication for the regression function if $\beta_0 = 0$ is simply that we expect Y = 0 when X = 0, and the plot of the regression function passes through the origin.

1.38 a) For $b_0 = 9$ and $b_1 = 3$, the criterion is:

$$Q = \sum_{i=1}^{10} \left(y_i - (9 + 3x_i)\right)^2 = 76$$

b) Similarly, for $b_0 = 11$ and $b_1 = 5$:

$$Q = \sum_{i=1}^{10} \left(y_i - (11 + 5x_i)\right)^2 = 60$$

Yes. The criterion Q for these estimates is, as expected, larger than for the least squares estimates, for which Q = SSE = 17.6.

1.41 a) The least squares estimator of $\beta_1$ is obtained by minimizing the least squares criterion Q. Hence we need to minimize:

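$$Q = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2$$

To get the estimator, we take the derivative of Q with respect to $\beta_1$ and equate it to zero:

$$\frac{dQ}{d\beta_1} = \frac{d}{d\beta_1} \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2 = -2 \sum_{i=1}^{n} x_i (y_i - \beta_1 x_i) = 0$$

Thus, solving the above equation for $\beta_1$, evaluated at $b_1$, we get the least squares estimator:

$$b_1 = \frac{\sum x_i y_i}{\sum x_i^2}$$

b) First, the density of an observation $Y_i$ for the normal error model, utilizing the fact that $E\{Y_i\} = \beta_1 x_i$ and $\sigma^2\{Y_i\} = \sigma^2$, is given by:

$$f_i = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2}\left(\frac{y_i - \beta_1 x_i}{\sigma}\right)^2\right]$$

The likelihood function for the n observations $Y_1, Y_2, \ldots, Y_n$ is the product of the above individual densities. Since the variance of the error term, $\sigma^2$, is known in this problem, the likelihood function is a function of $\beta_1$ only. Hence,

$$L(\beta_1) = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[-\frac{1}{2\sigma^2}(y_i - \beta_1 x_i)^2\right] = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_1 x_i)^2\right]$$

The maximum likelihood estimator (MLE) of $\beta_1$ is obtained by maximizing the above likelihood. Since the value of $\beta_1$ that maximizes the likelihood also maximizes $\log L(\beta_1)$, we find the MLE from:

$$\log L(\beta_1) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_1 x_i)^2$$

Next, taking the derivative with respect to $\beta_1$ and equating it to zero gives the MLE (check the second derivative as well). That is,

$$\frac{d \log L(\beta_1)}{d\beta_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i (y_i - \beta_1 x_i) = 0$$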

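Solving for $\beta_1$, evaluated at $b_1$, gives the MLE:

$$b_1 = \frac{\sum x_i y_i}{\sum x_i^2}$$

which is the same as the least squares estimator in part a).

c) An estimator $b_1$ of $\beta_1$ is unbiased if $E\{b_1\} = \beta_1$. From the regression equation $y_i = \beta_1 x_i + \varepsilon_i$, we deduce $E\{y_i\} = \beta_1 x_i$. Thus, it follows that

$$E\{b_1\} = E\left\{\frac{\sum x_i y_i}{\sum x_i^2}\right\} = \frac{\sum x_i E\{y_i\}}{\sum x_i^2} = \frac{\beta_1 \sum x_i^2}{\sum x_i^2} = \beta_1$$

Therefore, the MLE $b_1$ is an unbiased estimator of $\beta_1$.

1.43 a) Let Y be the number of active physicians in a CDI, $X_1$ the total population, $X_2$ the number of hospital beds and $X_3$ the total personal income. The estimated regression functions of the number of active physicians on each of the predictors are:

    $\hat{Y} = -110.635 + 0.003X_1$
    $\hat{Y} = -95.932 + 0.743X_2$
    $\hat{Y} = -48.395 + 0.132X_3$

b) Plot of regression functions and data.

[Figure: scatter plots with fitted regression lines of Number of Active Physicians versus Total Population and versus Number of Hospital Beds]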

[Figure: scatter plot with fitted regression line of Number of Active Physicians versus Total Personal Income]

Yes, the linear regression relation appears to provide a good fit for each of the three predictor variables. However, the two data points that lie apart from the rest of the scatter should be viewed with caution.

c) The mean square error for each predictor variable can be obtained from the analysis of variance table:

    Predictor       MSE
    X1          372,204
    X2          310,192
    X3          324,539

So, based on the MSE, we can deduce that the regression model that contains $X_2$ (the number of hospital beds) has the smallest variability around the fitted regression line.

1.44 Let Y stand for per capita income and let X represent the percentage of individuals having at least a bachelor's degree.

a) The estimated regression function for each region is given by:

    Region   Estimated regression function          MSE
    NE       $\hat{Y} = 9223.82 + 522.16X$     7,335,008
    NC       $\hat{Y} = 13581.41 + 238.67X$    4,411,341
    S        $\hat{Y} = 10529.79 + 330.61X$    7,474,349
    W        $\hat{Y} = 8615.05 + 440.32X$     8,214,318

b) The regression functions are similar in the direction of the relationship between per capita income and the percentage of individuals having at least a bachelor's degree: it is positive in all regions. A unit increase in the percentage of individuals with a bachelor's degree results in an increase in per capita income. The rate of increase, however, differs among the regions. For instance, the increase in per capita income for a unit increase in the percentage of individuals with a bachelor's degree is highest in NE and smallest in NC.

c) The MSE for each region is shown in column 3 above. There is a difference in the variability around the fitted regression line among the regions. The variability is relatively higher in W and smaller in NC.
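Returning to Problem 1.41, the closed-form estimator $b_1 = \sum x_i y_i / \sum x_i^2$ for the regression-through-the-origin model can be checked numerically. The following is a sketch under stated assumptions: the design points, true slope, error standard deviation, seed and function names are all invented for illustration and do not come from the homework.

```python
import random

# ASSUMPTION: every number below is made up for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]   # fixed design points
TRUE_BETA1 = 2.0                # slope used to simulate data
SIGMA = 1.0                     # known error standard deviation


def b1_through_origin(x, y):
    """Least squares (and normal-error ML) slope for y_i = beta1 * x_i + e_i."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def q_criterion(beta1, x, y):
    """Least squares criterion Q(beta1) = sum of (y_i - beta1 * x_i)^2."""
    return sum((yi - beta1 * xi) ** 2 for xi, yi in zip(x, y))


rng = random.Random(108)  # fixed seed so the run is repeatable

# One simulated sample: Q should be smaller at b1 than at nearby slopes.
y_sample = [TRUE_BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in X]
b1 = b1_through_origin(X, y_sample)
q_at_b1 = q_criterion(b1, X, y_sample)
q_nearby = [q_criterion(b1 + d, X, y_sample) for d in (-0.1, 0.1)]

# Many simulated samples: the average of b1 should be close to beta1.
estimates = []
for _ in range(2000):
    y = [TRUE_BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in X]
    estimates.append(b1_through_origin(X, y))
mean_b1 = sum(estimates) / len(estimates)

print(f"b1 = {b1:.4f}, Q(b1) = {q_at_b1:.4f}, Q at b1 -/+ 0.1 = {q_nearby}")
print(f"average b1 over 2000 samples = {mean_b1:.4f} (true beta1 = {TRUE_BETA1})")
```

A simulation average near the true slope is consistent with the unbiasedness shown in part c), although it does not prove it; the algebraic argument remains the actual justification.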