Chapter 2 Simple Linear Regression


S.J. Sheather, A Modern Approach to Regression with R, DOI: 10.1007/978-0-387-09608-7_2, © Springer Science + Business Media LLC 2009

2.1 Introduction and Least Squares Estimates

Regression analysis is a method for investigating the functional relationship among variables. In this chapter we consider problems involving modeling the relationship between two variables. These problems are commonly referred to as simple linear regression or straight-line regression. In later chapters we shall consider problems involving modeling the relationship between three or more variables. In particular we next consider problems involving modeling the relationship between two variables as a straight line, that is, when Y is modeled as a linear function of X.

Example: A regression model for the timing of production runs

We shall consider the following example taken from Foster, Stine and Waterman (1997) throughout this chapter. The original data are in the form of the time taken (in minutes) for a production run, Y, and the number of items produced, X, for 20 randomly selected orders as supervised by three managers. At this stage we shall only consider the data for one of the managers (see Table 2.1 and Figure 2.1). We wish to develop an equation to model the relationship between Y, the run time, and X, the run size.

A scatter plot of the data like that given in Figure 2.1 should ALWAYS be drawn to obtain an idea of the sort of relationship that exists between two variables (e.g., linear, quadratic, exponential, etc.).

2.1.1 Simple Linear Regression Models

When data are collected in pairs the standard notation used to designate this is:

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$

where $x_1$ denotes the first value of the so-called X-variable and $y_1$ denotes the first value of the so-called Y-variable. The X-variable is called the explanatory or predictor variable, while the Y-variable is called the response variable or the dependent variable. The X-variable often has a different status to the Y-variable in that:

• It can be thought of as a potential predictor of the Y-variable
• Its value can sometimes be chosen by the person undertaking the study

[Table 2.1 Production data (production.txt); columns: Case, Run time, Run size. The values did not survive transcription.]

[Figure 2.1 A scatter plot of the production data (Run Time against Run Size)]

Simple linear regression is typically used to model the relationship between two variables Y and X so that given a specific value of X, that is, X = x, we can predict the value of Y. Mathematically, the regression of a random variable Y on a random variable X is

$$E(Y \mid X = x),$$

the expected value of Y when X takes the specific value x. For example, if X = Day of the week and Y = Sales at a given company, then the regression of Y on X represents the mean (or average) sales on a given day. The regression of Y on X is linear if

$$E(Y \mid X = x) = \beta_0 + \beta_1 x \qquad (2.1)$$

where the unknown parameters $\beta_0$ and $\beta_1$ determine the intercept and the slope of a specific straight line, respectively. Suppose that $Y_1, Y_2, \ldots, Y_n$ are independent realizations of the random variable Y that are observed at the values $x_1, x_2, \ldots, x_n$ of a random variable X. If the regression of Y on X is linear, then for $i = 1, 2, \ldots, n$

$$Y_i = E(Y \mid X = x_i) + e_i = \beta_0 + \beta_1 x_i + e_i$$

where $e_i$ is the random error in $Y_i$ and is such that $E(e \mid X) = 0$.

The random error term is there since there will almost certainly be some variation in Y due strictly to random phenomena that cannot be predicted or explained. In other words, all unexplained variation is called random error. Thus, the random error term does not depend on x, nor does it contain any information about Y (otherwise it would be a systematic error).

We shall begin by assuming that

$$\mathrm{Var}(Y \mid X = x) = \sigma^2. \qquad (2.2)$$

In Chapter 4 we shall see how this last assumption can be relaxed.

Estimating the population slope and intercept

Suppose for example that X = height and Y = weight of a randomly selected individual from some population, then for a straight line regression model the mean weight of individuals of a given height would be a linear function of that height. In practice, we usually have a sample of data instead of the whole population. The slope $\beta_1$ and intercept $\beta_0$ are unknown, since these are the values for the whole population. Thus, we wish to use the given data to estimate the slope and the intercept. This can be achieved by finding the equation of the line which best fits our data, that is, choose $b_0$ and $b_1$ such that $\hat{y}_i = b_0 + b_1 x_i$ is as close as possible to $y_i$. Here the notation $\hat{y}_i$ is used to denote the value of the line of best fit in order to distinguish it from the observed values of y, that is, $y_i$. We shall refer to $\hat{y}_i$ as the ith predicted value or the fitted value of $y_i$.

Residuals

In practice, we wish to minimize the difference between the actual value of y ($y_i$) and the predicted value of y ($\hat{y}_i$). This difference is called the residual, $\hat{e}_i$, that is,

$$\hat{e}_i = y_i - \hat{y}_i.$$

Figure 2.2 shows a hypothetical situation based on six data points. Marked on this plot is a line of best fit, $\hat{y}_i$, along with the residuals.
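To make the idea of a residual concrete, here is a minimal Python sketch. The five (x, y) points and the two candidate lines are hypothetical values chosen only for illustration (they are not the production data), and the helper names `residuals` and `rss` are assumptions of this sketch.

```python
# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def residuals(x, y, b0, b1):
    """Return the residuals y_i - (b0 + b1 * x_i) for a candidate line."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def rss(x, y, b0, b1):
    """Sum of squared residuals for the candidate line with intercept b0 and slope b1."""
    return sum(e ** 2 for e in residuals(x, y, b0, b1))


# Two hypothetical candidate lines: the one with the smaller sum of
# squared residuals lies closer to these points.
rss_a = rss(x, y, 0.0, 2.0)
rss_b = rss(x, y, 1.0, 1.5)
print(rss_a, rss_b)
```

For these made-up points the line with intercept 0 and slope 2 has the much smaller sum of squared residuals, which is the sense in which it fits better.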
Least squares line of best fit

A very popular method of choosing $b_0$ and $b_1$ is called the method of least squares. As the name suggests $b_0$ and $b_1$ are chosen to minimize the sum of squared residuals (or residual sum of squares [RSS]),

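$$\mathrm{RSS} = \sum_{i=1}^{n} \hat{e}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2.$$

[Figure 2.2 A scatter plot of data with a line of best fit and the residuals $\hat{e}_1, \ldots, \hat{e}_6$ identified]

For RSS to be a minimum with respect to $b_0$ and $b_1$ we require

$$\frac{\partial\, \mathrm{RSS}}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0$$

and

$$\frac{\partial\, \mathrm{RSS}}{\partial b_1} = -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = 0.$$

Rearranging terms in these last two equations gives

$$\sum_{i=1}^{n} y_i = b_0 n + b_1 \sum_{i=1}^{n} x_i$$

and

$$\sum_{i=1}^{n} x_i y_i = b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2.$$

These last two equations are called the normal equations. Solving these equations for $b_0$ and $b_1$ gives the so-called least squares estimates of the intercept

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (2.3)$$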

and the slope

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{SXY}{SXX}. \qquad (2.4)$$

Regression Output from R

The least squares estimates for the production data were calculated using R, giving the following results:

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 149.74770    8.32815   17.98 6.00e-13 ***
    RunSize       0.25924    0.03714    6.98 1.61e-06 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 16.25 on 18 degrees of freedom
    Multiple R-Squared: 0.7302, Adjusted R-squared: 0.7152
    F-statistic: 48.72 on 1 and 18 DF, p-value: 1.615e-06

The least squares line of best fit for the production data

Figure 2.3 shows a scatter plot of the production data with the least squares line of best fit. The equation of the least squares line of best fit is

$$y = 149.7 + 0.26x.$$

Let us look at the results that we have obtained from the line of best fit in Figure 2.3. The intercept in Figure 2.3 is 149.7, which is where the line of best fit crosses the run time axis. The slope of the line in Figure 2.3 is 0.26. Thus, we say that each additional unit to be produced is predicted to add 0.26 minutes to the run time. The intercept in the model has the following interpretation: for any production run, the average set up time is 149.7 minutes.

Estimating the variance of the random error term

Consider the linear regression model with constant variance given by (2.1) and (2.2). In this case,

$$Y_i = \beta_0 + \beta_1 x_i + e_i \quad (i = 1, 2, \ldots, n)$$

where the random error $e_i$ has mean 0 and variance $\sigma^2$. We wish to estimate $\sigma^2 = \mathrm{Var}(e)$. Notice that

$$e_i = Y_i - (\beta_0 + \beta_1 x_i) = Y_i - \text{unknown regression line at } x_i.$$

[Figure 2.3 A plot of the production data with the least squares line of best fit (Run Time against Run Size)]

Since $\beta_0$ and $\beta_1$ are unknown all we can do is estimate these errors by replacing $\beta_0$ and $\beta_1$ by their respective least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$, giving the residuals

$$\hat{e}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) = Y_i - \text{estimated regression line at } x_i.$$

These residuals can be used to estimate $\sigma^2$. In fact it can be shown that

$$S^2 = \frac{\mathrm{RSS}}{n-2} = \frac{1}{n-2} \sum_{i=1}^{n} \hat{e}_i^2$$

is an unbiased estimate of $\sigma^2$. Two points to note are:

1. $\bar{\hat{e}} = 0$ (since $\sum \hat{e}_i = 0$, as the least squares estimates minimize $\mathrm{RSS} = \sum \hat{e}_i^2$)
2. The divisor in $S^2$ is $n - 2$ since we have estimated two parameters, namely $\beta_0$ and $\beta_1$.

2.2 Inferences About the Slope and the Intercept

In this section, we shall develop methods for finding confidence intervals and for performing hypothesis tests about the slope and the intercept of the regression line.
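The least squares formulas (2.3) and (2.4) and the variance estimate $S^2 = \mathrm{RSS}/(n-2)$ from Section 2.1 can be sketched in a few lines of Python. The data below are hypothetical toy values, not the production data, and the function names `least_squares` and `residual_variance` are assumptions made for this sketch rather than anything taken from the book's R code.

```python
def least_squares(x, y):
    """Return (b0_hat, b1_hat) with b1_hat = SXY / SXX and b0_hat = y_bar - b1_hat * x_bar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


def residual_variance(x, y):
    """Return (S^2, residuals), where S^2 = RSS / (n - 2)."""
    b0_hat, b1_hat = least_squares(x, y)
    resid = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in resid)
    return rss / (len(x) - 2), resid


# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = least_squares(x, y)
s2, resid = residual_variance(x, y)
print(b0_hat, b1_hat, s2)
```

The residuals returned by `residual_variance` sum to zero up to floating-point error, which is the first of the two points noted for $S^2$.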

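2.2.1 Assumptions Necessary in Order to Make Inferences About the Regression Model

Throughout this section we shall make the following assumptions:

1. Y is related to x by the simple linear regression model $Y_i = \beta_0 + \beta_1 x_i + e_i$ $(i = 1, \ldots, n)$, i.e., $E(Y \mid X = x_i) = \beta_0 + \beta_1 x_i$
2. The errors $e_1, e_2, \ldots, e_n$ are independent of each other
3. The errors $e_1, e_2, \ldots, e_n$ have a common variance $\sigma^2$
4. The errors are normally distributed with a mean of 0 and variance $\sigma^2$, that is, $e \mid X \sim N(0, \sigma^2)$

Methods for checking these four assumptions will be considered in Chapter 3. In addition, since the regression model is conditional on X we can assume that the values of the predictor variable, $x_1, x_2, \ldots, x_n$ are known fixed constants.

2.2.2 Inferences About the Slope of the Regression Line

Recall from (2.4) that the least squares estimate of $\beta_1$ is given by

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{SXY}{SXX}$$

Since $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ we find that

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x}) y_i - \bar{y} \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x}) y_i$$

Thus, we can rewrite $\hat{\beta}_1$ as

$$\hat{\beta}_1 = \sum_{i=1}^{n} c_i y_i \quad \text{where} \quad c_i = \frac{x_i - \bar{x}}{SXX} \qquad (2.5)$$

We shall see that this version of $\hat{\beta}_1$ will be used whenever we study its theoretical properties. Under the above assumptions, we shall show in Section 2.7 that

$$E(\hat{\beta}_1 \mid X) = \beta_1 \qquad (2.6)$$

$$\mathrm{Var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{SXX} \qquad (2.7)$$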

$$\hat{\beta}_1 \mid X \sim N\left(\beta_1, \frac{\sigma^2}{SXX}\right) \qquad (2.8)$$

Note that in (2.7) the variance of the least squares slope estimate decreases as SXX increases (i.e., as the variability in the X's increases). This is an important fact to note if the experimenter has control over the choice of the values of the X variable.

Standardizing (2.8) gives

$$Z = \frac{\hat{\beta}_1 - \beta_1}{\sigma / \sqrt{SXX}} \sim N(0, 1)$$

If $\sigma$ were known then we could use Z to test hypotheses and find confidence intervals for $\beta_1$. When $\sigma$ is unknown (as is usually the case) replacing $\sigma$ by S, the standard deviation of the residuals, results in

$$T = \frac{\hat{\beta}_1 - \beta_1}{S / \sqrt{SXX}} = \frac{\hat{\beta}_1 - \beta_1}{\mathrm{se}(\hat{\beta}_1)}$$

where $\mathrm{se}(\hat{\beta}_1) = S / \sqrt{SXX}$ is the estimated standard error (se) of $\hat{\beta}_1$, which is given directly by R. In the production example the X-variable is RunSize and so $\mathrm{se}(\hat{\beta}_1) = 0.03714$.

It can be shown that under the above assumptions T has a t-distribution with $n - 2$ degrees of freedom, that is

$$T = \frac{\hat{\beta}_1 - \beta_1}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2}$$

Notice that the degrees of freedom satisfies the following formula

degrees of freedom = sample size − number of mean parameters estimated.

In this case we are estimating two such parameters, namely, $\beta_0$ and $\beta_1$.

For testing the hypothesis $H_0: \beta_1 = \beta_1^0$ the test statistic is

$$T = \frac{\hat{\beta}_1 - \beta_1^0}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2} \quad \text{when } H_0 \text{ is true.}$$

R provides the value of T and the p-value associated with testing $H_0: \beta_1 = 0$ against $H_A: \beta_1 \neq 0$ (i.e., for the choice $\beta_1^0 = 0$). In the production example the X-variable is RunSize and T = 6.98, which results in a p-value less than 0.0001.

A $100(1 - \alpha)\%$ confidence interval for $\beta_1$, the slope of the regression line, is given by

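$$\left(\hat{\beta}_1 - t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_1),\; \hat{\beta}_1 + t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_1)\right)$$

where $t(\alpha/2, n-2)$ is the $100(1 - \alpha/2)$th quantile of the t-distribution with $n - 2$ degrees of freedom. In the production example the X-variable is RunSize and $\hat{\beta}_1 = 0.25924$, $\mathrm{se}(\hat{\beta}_1) = 0.03714$, $t(0.025, 20 - 2 = 18) = 2.1009$. Thus a 95% confidence interval for $\beta_1$ is given by

$$(0.25924 \pm 2.1009 \times 0.03714) = (0.25924 \pm 0.07803) = (0.181, 0.337)$$

2.2.3 Inferences About the Intercept of the Regression Line

Recall from (2.3) that the least squares estimate of $\beta_0$ is given by

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Under the assumptions given previously we shall show in Section 2.7 that

$$E(\hat{\beta}_0 \mid X) = \beta_0 \qquad (2.9)$$

$$\mathrm{Var}(\hat{\beta}_0 \mid X) = \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right) \qquad (2.10)$$

$$\hat{\beta}_0 \mid X \sim N\left(\beta_0,\; \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)\right) \qquad (2.11)$$

Standardizing (2.11) gives

$$Z = \frac{\hat{\beta}_0 - \beta_0}{\sigma \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX}}} \sim N(0, 1)$$

If $\sigma$ were known then we could use Z to test hypotheses and find confidence intervals for $\beta_0$. When $\sigma$ is unknown (as is usually the case) replacing $\sigma$ by S results in

$$T = \frac{\hat{\beta}_0 - \beta_0}{S \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX}}} = \frac{\hat{\beta}_0 - \beta_0}{\mathrm{se}(\hat{\beta}_0)} \sim t_{n-2}$$

where $\mathrm{se}(\hat{\beta}_0) = S \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX}}$ is the estimated standard error of $\hat{\beta}_0$, which is given directly by R. In the production example the intercept is called Intercept and so $\mathrm{se}(\hat{\beta}_0) = 8.32815$.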

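For testing the hypothesis $H_0: \beta_0 = \beta_0^0$ the test statistic is

$$T = \frac{\hat{\beta}_0 - \beta_0^0}{\mathrm{se}(\hat{\beta}_0)} \sim t_{n-2} \quad \text{when } H_0 \text{ is true.}$$

R provides the value of T and the p-value associated with testing $H_0: \beta_0 = 0$ against $H_A: \beta_0 \neq 0$. In the production example the intercept is called Intercept and T = 17.98, which results in a p-value < 0.0001.

A $100(1 - \alpha)\%$ confidence interval for $\beta_0$, the intercept of the regression line, is given by

$$\left(\hat{\beta}_0 - t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_0),\; \hat{\beta}_0 + t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_0)\right)$$

where $t(\alpha/2, n-2)$ is the $100(1 - \alpha/2)$th quantile of the t-distribution with $n - 2$ degrees of freedom. In the production example, $\hat{\beta}_0 = 149.7477$, $\mathrm{se}(\hat{\beta}_0) = 8.32815$, $t(0.025, 20 - 2 = 18) = 2.1009$. Thus a 95% confidence interval for $\beta_0$ is given by

$$(149.7477 \pm 2.1009 \times 8.32815) = (149.7477 \pm 17.497) = (132.3, 167.2)$$

Regression Output from R: 95% confidence intervals

                  2.5 %  97.5 %
    (Intercept) 132.251 167.244
    RunSize       0.181   0.337

2.3 Confidence Intervals for the Population Regression Line

In this section we consider the problem of finding a confidence interval for the unknown population regression line at a given value of X, which we shall denote by $x^*$. First, recall from (2.1) that the population regression line at $X = x^*$ is given by

$$E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$$

An estimator of this unknown quantity is the value of the estimated regression equation at $X = x^*$, namely,

$$\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$$

Under the assumptions stated previously, it can be shown that

$$E(\hat{y}^*) = E(\hat{y} \mid X = x^*) = \beta_0 + \beta_1 x^* \qquad (2.12)$$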

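$$\mathrm{Var}(\hat{y}^*) = \mathrm{Var}(\hat{y} \mid X = x^*) = \sigma^2 \left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right) \qquad (2.13)$$

$$\hat{y}^* = \hat{y} \mid X = x^* \sim N\left(\beta_0 + \beta_1 x^*,\; \sigma^2 \left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right)\right) \qquad (2.14)$$

Standardizing (2.14) gives

$$Z = \frac{\hat{y}^* - (\beta_0 + \beta_1 x^*)}{\sigma \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX}}} \sim N(0, 1)$$

Replacing $\sigma$ by S results in

$$T = \frac{\hat{y}^* - (\beta_0 + \beta_1 x^*)}{S \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX}}} \sim t_{n-2}$$

A $100(1 - \alpha)\%$ confidence interval for $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$, the population regression line at $X = x^*$, is given by

$$\hat{y}^* \pm t(\alpha/2, n-2)\, S \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}} = \hat{\beta}_0 + \hat{\beta}_1 x^* \pm t(\alpha/2, n-2)\, S \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}}$$

where $t(\alpha/2, n-2)$ is the $100(1 - \alpha/2)$th quantile of the t-distribution with $n - 2$ degrees of freedom.

2.4 Prediction Intervals for the Actual Value of Y

In this section we consider the problem of finding a prediction interval for the actual value of Y at $x^*$, a given value of X.

Important Notes:

1. $E(Y \mid X = x^*)$, the expected value or average value of Y for a given value $x^*$ of X, is what one would expect Y to be in the long run when $X = x^*$. $E(Y \mid X = x^*)$ is therefore a fixed but unknown quantity whereas Y can take a number of values when $X = x^*$.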

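2. $E(Y \mid X = x^*)$, the value of the regression line at $X = x^*$, is entirely different from $Y^*$, a single value of Y when $X = x^*$. In particular, $Y^*$ need not lie on the population regression line.
3. A confidence interval is always reported for a parameter (e.g., $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$) and a prediction interval is reported for the value of a random variable (e.g., $Y^*$).

We base our prediction of Y when $X = x^*$ (that is of $Y^*$) on

$$\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$$

The error in our prediction is

$$Y^* - \hat{y}^* = \beta_0 + \beta_1 x^* + e^* - \hat{y}^* = E(Y \mid X = x^*) - \hat{y}^* + e^*$$

that is, the deviation between $E(Y \mid X = x^*)$ and $\hat{y}^*$ plus the random fluctuation $e^*$ (which represents the deviation of $Y^*$ from $E(Y \mid X = x^*)$). Thus the variability in the error for predicting a single value of Y will exceed the variability for estimating the expected value of Y (because of the random error $e^*$).

It can be shown that under the previously stated assumptions

$$E(Y^* - \hat{y}^*) = E(Y - \hat{y} \mid X = x^*) = 0 \qquad (2.15)$$

$$\mathrm{Var}(Y^* - \hat{y}^*) = \mathrm{Var}(Y - \hat{y} \mid X = x^*) = \sigma^2 \left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right] \qquad (2.16)$$

$$Y^* - \hat{y}^* \sim N\left(0,\; \sigma^2 \left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right]\right) \qquad (2.17)$$

Standardizing (2.17) and replacing $\sigma$ by S gives

$$T = \frac{Y^* - \hat{y}^*}{S \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX}}} \sim t_{n-2}$$

A $100(1 - \alpha)\%$ prediction interval for $Y^*$, the value of Y at $X = x^*$, is given by

$$\hat{y}^* \pm t(\alpha/2, n-2)\, S \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}} = \hat{\beta}_0 + \hat{\beta}_1 x^* \pm t(\alpha/2, n-2)\, S \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}}$$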

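where $t(\alpha/2, n-2)$ is the $100(1 - \alpha/2)$th quantile of the t-distribution with $n - 2$ degrees of freedom.

Regression Output from R

Ninety-five percent confidence intervals for the population regression line (i.e., the average RunTime) at RunSize = 50, 100, 150, 200, 250, 300, 350 are:

[R output with columns fit, lwr, upr, one row per RunSize value. The numeric values did not survive transcription.]

Ninety-five percent prediction intervals for the actual value of Y (i.e., the actual RunTime) at RunSize = 50, 100, 150, 200, 250, 300, 350 are:

[R output with columns fit, lwr, upr, one row per RunSize value. The numeric values did not survive transcription.]

Notice that each prediction interval is considerably wider than the corresponding confidence interval, as is expected.

2.5 Analysis of Variance

There is a linear association between Y and x if

$$Y = \beta_0 + \beta_1 x + e$$

and $\beta_1 \neq 0$. If we knew that $\beta_1 \neq 0$ then we would predict Y by

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

On the other hand, if we knew that $\beta_1 = 0$ then we predict Y by

$$\hat{y} = \bar{y}$$

To test whether there is a linear association between Y and X we have to test

$$H_0: \beta_1 = 0 \quad \text{against} \quad H_A: \beta_1 \neq 0.$$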

We can perform this test using the following t-statistic

$$T = \frac{\hat{\beta}_1 - 0}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2} \quad \text{when } H_0 \text{ is true.}$$

We next look at a different test statistic which can be used when there is more than one predictor variable, that is, in multiple regression. First, we introduce some terminology.

Define the total corrected sum of squares of the Y's by

$$SST = SYY = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

Recall that the residual sum of squares is given by

$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Define the regression sum of squares (i.e., sum of squares explained by the regression model) by

$$SSreg = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

It is clear that SSreg is close to zero if for each i, $\hat{y}_i$ is close to $\bar{y}$, while SSreg is large if $\hat{y}_i$ differs from $\bar{y}$ for most values of x.

We next look at the hypothetical situation in Figure 2.4 with just a single data point $(x_i, y_i)$ shown along with the least squares regression line and the mean of y based on all n data points. It is apparent from Figure 2.4 that

$$y_i - \bar{y} = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}).$$

Further, it can be shown that

$$SST = SSreg + \mathrm{RSS}$$

Total sample variability = Variability explained by the model + Unexplained (or error) variability

See exercise 6 in Section 2.8 for details.

If $Y = \beta_0 + \beta_1 x + e$ and $\beta_1 \neq 0$ then RSS should be "small" and SSreg should be "close" to SST. But how small is "small" and how close is "close"?
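The decomposition SST = SSreg + RSS can be checked numerically. The following Python sketch does so for a hypothetical toy data set (not the production data); the function name `sums_of_squares` is an assumption of this sketch.

```python
def sums_of_squares(x, y):
    """Return (SST, SSreg, RSS) for the least squares line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    fitted = [b0_hat + b1_hat * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                   # total corrected sum of squares
    ssreg = sum((fi - y_bar) ** 2 for fi in fitted)            # explained by the regression
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))     # residual sum of squares
    return sst, ssreg, rss


# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

sst, ssreg, rss = sums_of_squares(x, y)
print(sst, ssreg, rss, ssreg + rss)
```

For these nearly collinear points almost all of SST is accounted for by SSreg, and RSS is small by comparison.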

[Figure 2.4 Graphical depiction that $y_i - \bar{y} = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$]

To test

$$H_0: \beta_1 = 0 \quad \text{against} \quad H_A: \beta_1 \neq 0$$

we can use the test statistic

$$F = \frac{SSreg / 1}{\mathrm{RSS} / (n - 2)}$$

since RSS has $(n - 2)$ degrees of freedom and SSreg has 1 degree of freedom.

Under the assumption that $e_1, e_2, \ldots, e_n$ are independent and normally distributed with mean 0 and variance $\sigma^2$, it can be shown that F has an F distribution with 1 and $n - 2$ degrees of freedom when $H_0$ is true, that is,

$$F = \frac{SSreg / 1}{\mathrm{RSS} / (n - 2)} \sim F_{1, n-2} \quad \text{when } H_0 \text{ is true}$$

Form of test: reject $H_0$ at level $\alpha$ if $F > F_{\alpha, 1, n-2}$ (which can be obtained from a table of the F distribution). However, all statistical packages report the corresponding p-value.
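As a rough illustration, the F statistic above can be computed directly from the sums of squares and compared with the t-statistic for the slope; in simple linear regression the two satisfy $F = T^2$. The Python sketch below uses hypothetical toy data (not the production data), and the function name `f_and_t` is an assumption of this sketch.

```python
import math


def f_and_t(x, y):
    """Return (F, T) for testing H0: beta_1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    fitted = [b0_hat + b1_hat * xi for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssreg = sum((fi - y_bar) ** 2 for fi in fitted)
    s = math.sqrt(rss / (n - 2))           # residual standard error S
    se_b1 = s / math.sqrt(sxx)             # se(b1_hat) = S / sqrt(SXX)
    f_stat = (ssreg / 1) / (rss / (n - 2))
    t_stat = (b1_hat - 0) / se_b1
    return f_stat, t_stat


# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

f_stat, t_stat = f_and_t(x, y)
print(f_stat, t_stat ** 2)
```

The sketch stops at the test statistics; turning F into a p-value needs the F distribution, which the Python standard library does not provide, so that step is left to a statistics package.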

The usual way of setting out this test is to use an Analysis of variance table

    Source of variation   Degrees of freedom (df)   Sum of squares (SS)   Mean square (MS)   F
    Regression            1                         SSreg                 SSreg/1            F = (SSreg/1) / (RSS/(n − 2))
    Residual              n − 2                     RSS                   RSS/(n − 2)
    Total                 n − 1                     SST

Notes:

1. It can be shown that in the case of simple linear regression

$$T = \frac{\hat{\beta}_1 - 0}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2} \quad \text{and} \quad F = \frac{SSreg / 1}{\mathrm{RSS} / (n - 2)} \sim F_{1, n-2}$$

are related via $F = T^2$.

2. $R^2$, the coefficient of determination of the regression line, is defined as the proportion of the total sample variability in the Y's explained by the regression model, that is,

$$R^2 = \frac{SSreg}{SST} = 1 - \frac{\mathrm{RSS}}{SST}$$

The reason this quantity is called $R^2$ is that it is equal to the square of the correlation between Y and X. It is arguably one of the most commonly misused statistics.

Regression Output from R

    Analysis of Variance Table
    Response: RunTime
              Df  Sum Sq Mean Sq F value    Pr(>F)
    RunSize    1 12868.4 12868.4  48.717 1.615e-06 ***
    Residuals 18  4754.6   264.1
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notice that the observed F-value of 48.72 is just the square of the observed t-value 6.98, which can be found between Figures 2.2 and 2.3. We shall see in Chapter 5 that Analysis of Variance overcomes the problems associated with multiple t-tests which occur when there are many predictor variables.

2.6 Dummy Variable Regression

So far we have only considered situations in which the predictor or X-variable is quantitative (i.e., takes numerical values). We next consider so-called dummy variable regression, which is used in its simplest form when a predictor is categorical

with two values (e.g., gender) rather than quantitative. The resulting regression models allow us to test for the difference between the means of two groups. We shall see in a later topic that the concept of a dummy variable can be extended to include problems involving more than two groups.

Using dummy variable regression to compare new and old methods

We shall consider the following example throughout this section. It is taken from Foster, Stine and Waterman (1997). In this example, we consider a large food processing center that needs to be able to switch from one type of package to another quickly to react to changes in order patterns. Consultants have developed a new method for changing the production line and used it to produce a sample of 48 change-over times (in minutes). Also available is an independent sample of 72 change-over times (in minutes) for the existing method. These two sets of times can be found on the book web site in the file called changeover_times.txt. The first three and the last three rows of the data from this file are reproduced below in Table 2.2. Plots of the data appear in Figure 2.5.

We wish to develop an equation to model the relationship between Y, the change-over time, and X, the dummy variable corresponding to New, and hence test whether the mean change-over time is reduced using the new method.

We consider the simple linear regression model

$$Y = \beta_0 + \beta_1 x + e$$

where Y = change-over time and x is the dummy variable (i.e., x = 1 if the time corresponds to the new change-over method and 0 if it corresponds to the existing method).

Regression Output from R

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  17.8611     0.8905  20.058   <2e-16 ***
    New          -3.1736     1.4080  -2.254   0.0260 *
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 7.556 on 118 degrees of freedom
    Multiple R-Squared: 0.04128, Adjusted R-squared: 0.03315
    F-statistic: 5.081 on 1 and 118 DF, p-value: 0.02604

We can test whether there is a significant reduction in the change-over time for the new method by testing the significance of the dummy variable, that is, we wish to test whether the coefficient of x is zero or less than zero, that is:

$$H_0: \beta_1 = 0 \quad \text{against} \quad H_A: \beta_1 < 0$$

We use the one-sided "<" alternative since we are interested in whether the new method has led to a reduction in mean change-over time. The test statistic is

$$T = \frac{\hat{\beta}_1 - 0}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2} \quad \text{when } H_0 \text{ is true.}$$

[Table 2.2 Change-over time data (changeover_times.txt); columns: Method, Y (Change-over time), X (New). The rows shown are the first three (Existing, X = 0) and the last three (New, X = 1); the change-over times did not survive transcription.]

[Figure 2.5 A scatter plot and box plots of the change-over time data (Change Over Time against Dummy Variable, New; Change Over Time by Method: Existing, New)]

In this case, T = −2.254. (This result can be found in the output in the column headed "t value".) The associated p-value is given by

$$p\text{-value} = P(T < -2.254 \text{ when } H_0 \text{ is true}) = \frac{0.026}{2} = 0.013$$

as the two-sided p-value $= P(|T| > 2.254 \text{ when } H_0 \text{ is true}) = 0.026$. This means that there is significant evidence of a reduction in the mean change-over time for the new method.
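With a 0/1 dummy predictor, the least squares intercept equals the mean of the x = 0 group and the slope equals the difference between the two group means. The Python sketch below checks this on hypothetical change-over times (three per method, invented for illustration; they are not the values in changeover_times.txt). The function name `least_squares` is an assumption of this sketch.

```python
def least_squares(x, y):
    """Return (b0_hat, b1_hat) for the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    return y_bar - b1_hat * x_bar, b1_hat


# Hypothetical change-over times in minutes (NOT the book's data).
existing = [19.0, 24.0, 20.0]
new = [14.0, 18.0, 16.0]

# Dummy variable: 0 for the existing method, 1 for the new method.
x = [0.0] * len(existing) + [1.0] * len(new)
y = existing + new

b0_hat, b1_hat = least_squares(x, y)
mean_existing = sum(existing) / len(existing)
mean_new = sum(new) / len(new)
print(b0_hat, b1_hat, mean_existing, mean_new - mean_existing)
```

So testing whether the coefficient of the dummy variable is zero is the same as testing whether the two group means are equal, under the model's common-variance assumption.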

Next consider the group consisting of those times associated with the new change-over method. For this group, the dummy variable, x, is equal to 1. Thus, we can estimate the mean change-over time for the new method as:

$$17.8611 + (-3.1736) \times 1 = 14.6875 = 14.7 \text{ minutes}$$

Next consider the group consisting of those times associated with the existing change-over method. For this group, the dummy variable, x, is equal to 0. Thus, we can estimate the mean change-over time for the existing method as:

$$17.8611 + (-3.1736) \times 0 = 17.8611 = 17.9 \text{ minutes}$$

The new change-over method produces a reduction in the mean change-over time of 3.2 minutes, from 17.9 to 14.7 minutes. (Notice that the reduction in the mean change-over time for the new method is just the coefficient of the dummy variable.) This reduction is statistically significant.

A 95% confidence interval for the reduction in mean change-over time due to the new method is given by

$$\left(\hat{\beta}_1 - t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_1),\; \hat{\beta}_1 + t(\alpha/2, n-2)\,\mathrm{se}(\hat{\beta}_1)\right)$$

where $t(\alpha/2, n-2)$ is the $100(1 - \alpha/2)$th quantile of the t-distribution with $n - 2$ degrees of freedom. In this example the X-variable is the dummy variable New and $\hat{\beta}_1 = -3.1736$, $\mathrm{se}(\hat{\beta}_1) = 1.4080$, $t(0.025, 120 - 2 = 118) = 1.9803$. Thus a 95% confidence interval for $\beta_1$ (in minutes) is given by

$$(-3.1736 \pm 1.9803 \times 1.4080) = (-3.1736 \pm 2.7883) = (-5.96, -0.39).$$

Finally, the company should adopt the new method if a reduction of time of this size is of practical significance.

2.7 Derivations of Results

In this section, we shall derive some results given earlier about the least squares estimates of the slope and the intercept as well as results about confidence intervals and prediction intervals. Throughout this section we shall make the following assumptions:

1. Y is related to x by the simple linear regression model $Y_i = \beta_0 + \beta_1 x_i + e_i$ $(i = 1, \ldots, n)$, i.e., $E(Y \mid X = x_i) = \beta_0 + \beta_1 x_i$
2. The errors $e_1, e_2, \ldots, e_n$ are independent of each other
3. The errors $e_1, e_2, \ldots, e_n$ have a common variance $\sigma^2$
4. The errors are normally distributed with a mean of 0 and variance $\sigma^2$ (especially when the sample size is small), that is, $e \mid X \sim N(0, \sigma^2)$

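In addition, since the regression model is conditional on X we can assume that the values of the predictor variable, $x_1, x_2, \ldots, x_n$ are known fixed constants.

2.7.1 Inferences about the Slope of the Regression Line

Recall from (2.5) that the least squares estimate of $\beta_1$ is given by

$$\hat{\beta}_1 = \sum_{i=1}^{n} c_i y_i \quad \text{where} \quad c_i = \frac{x_i - \bar{x}}{SXX}.$$

Under the above assumptions we shall derive (2.6), (2.7) and (2.8). To derive (2.6) let's consider

$$\begin{aligned}
E(\hat{\beta}_1 \mid X) &= E\left[\sum_{i=1}^{n} c_i y_i \,\Big|\, X = x_i\right] \\
&= \sum_{i=1}^{n} c_i E[y_i \mid X = x_i] \\
&= \sum_{i=1}^{n} c_i (\beta_0 + \beta_1 x_i) \\
&= \beta_0 \sum_{i=1}^{n} c_i + \beta_1 \sum_{i=1}^{n} c_i x_i \\
&= \beta_0 \sum_{i=1}^{n} \frac{x_i - \bar{x}}{SXX} + \beta_1 \sum_{i=1}^{n} \frac{(x_i - \bar{x})\, x_i}{SXX} \\
&= \beta_1
\end{aligned}$$

since $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) x_i = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 = SXX$.

To derive (2.7) let's consider, using the independence of the $y_i$,

$$\begin{aligned}
\mathrm{Var}(\hat{\beta}_1 \mid X) &= \mathrm{Var}\left[\sum_{i=1}^{n} c_i y_i \,\Big|\, X = x_i\right] \\
&= \sum_{i=1}^{n} c_i^2\, \mathrm{Var}(y_i \mid X = x_i) \\
&= \sigma^2 \sum_{i=1}^{n} c_i^2 \\
&= \sigma^2 \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{SXX}\right)^2 \\
&= \frac{\sigma^2}{SXX}
\end{aligned}$$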
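The three facts about the weights $c_i$ used in these derivations ($\sum c_i = 0$, $\sum c_i x_i = 1$ and $\sum c_i^2 = 1/SXX$) are easy to check numerically. The Python sketch below does this for hypothetical x and y values (chosen only for illustration); the function name `slope_weights` is an assumption of this sketch.

```python
def slope_weights(x):
    """Return (c, SXX) where c_i = (x_i - x_bar) / SXX are the weights in (2.5)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [(xi - x_bar) / sxx for xi in x], sxx


# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

c, sxx = slope_weights(x)
sum_c = sum(c)                                   # should be 0
sum_cx = sum(ci * xi for ci, xi in zip(c, x))    # should be 1
sum_c2 = sum(ci ** 2 for ci in c)                # should be 1 / SXX
b1_hat = sum(ci * yi for ci, yi in zip(c, y))    # slope as a linear combination of the y_i
print(sum_c, sum_cx, sum_c2, 1 / sxx, b1_hat)
```

Writing the slope as a weighted sum of the $y_i$ with fixed weights is what makes the expectation and variance calculations above one-line applications of linearity.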

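Finally we derive (2.8). Under assumption (4), the errors $e_i \mid X$ are normally distributed. Since $y_i = \beta_0 + \beta_1 x_i + e_i$ $(i = 1, 2, \ldots, n)$, $Y_i \mid X$ is normally distributed. Since $\hat{\beta}_1 \mid X$ is a linear combination of the $y_i$'s, $\hat{\beta}_1 \mid X$ is normally distributed.

2.7.2 Inferences about the Intercept of the Regression Line

Recall from (2.3) that the least squares estimate of $\beta_0$ is given by

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$

Under the assumptions given previously we shall derive (2.9), (2.10) and (2.11). To derive (2.9) we shall use the fact that

$$E(\hat{\beta}_0 \mid X) = E(\bar{y} \mid X) - E(\hat{\beta}_1 \mid X)\, \bar{x}$$

The first piece of the last equation is

$$E(\bar{y} \mid X) = \frac{1}{n} \sum_{i=1}^{n} E(y_i \mid X = x_i) = \frac{1}{n} \sum_{i=1}^{n} E(\beta_0 + \beta_1 x_i + e_i) = \beta_0 + \beta_1 \frac{1}{n} \sum_{i=1}^{n} x_i = \beta_0 + \beta_1 \bar{x}$$

The second piece of that equation is

$$E(\hat{\beta}_1 \mid X)\, \bar{x} = \beta_1 \bar{x}.$$

Thus,

$$E(\hat{\beta}_0 \mid X) = E(\bar{y} \mid X) - E(\hat{\beta}_1 \mid X)\, \bar{x} = \beta_0 + \beta_1 \bar{x} - \beta_1 \bar{x} = \beta_0$$

To derive (2.10) let's consider

$$\mathrm{Var}(\hat{\beta}_0 \mid X) = \mathrm{Var}(\bar{y} - \hat{\beta}_1 \bar{x} \mid X) = \mathrm{Var}(\bar{y} \mid X) + \bar{x}^2\, \mathrm{Var}(\hat{\beta}_1 \mid X) - 2 \bar{x}\, \mathrm{Cov}(\bar{y}, \hat{\beta}_1 \mid X)$$

The first term is given by

$$\mathrm{Var}(\bar{y} \mid X) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} y_i \,\Big|\, X = x_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}.$$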

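From (2.7),

$$\mathrm{Var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{SXX}$$

Finally,

$$\mathrm{Cov}(\bar{y}, \hat{\beta}_1 \mid X) = \mathrm{Cov}\left(\frac{1}{n} \sum_{i=1}^{n} y_i,\; \sum_{i=1}^{n} c_i y_i\right) = \frac{1}{n} \sum_{i=1}^{n} c_i\, \mathrm{Cov}(y_i, y_i) = \frac{\sigma^2}{n} \sum_{i=1}^{n} c_i = 0$$

So,

$$\mathrm{Var}(\hat{\beta}_0 \mid X) = \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)$$

Result (2.11) follows from the fact that under assumption (4), $Y_i \mid X$ (and hence $\bar{y}$) are normally distributed, as is $\hat{\beta}_1 \mid X$.

2.7.3 Confidence Intervals for the Population Regression Line

Recall that the population regression line at $X = x^*$ is given by

$$E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$$

An estimator of the population regression line at $X = x^*$ (i.e., $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$) is the value of the estimated regression equation at $X = x^*$, namely,

$$\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$$

Under the assumptions stated previously, we shall derive (2.12), (2.13) and (2.14). First, notice that (2.12) follows from the following earlier established results: $E(\hat{\beta}_0 \mid X = x^*) = \beta_0$ and $E(\hat{\beta}_1 \mid X = x^*) = \beta_1$.

Next, consider (2.13)

$$\begin{aligned}
\mathrm{Var}(\hat{y} \mid X = x^*) &= \mathrm{Var}(\hat{\beta}_0 + \hat{\beta}_1 x \mid X = x^*) \\
&= \mathrm{Var}(\hat{\beta}_0 \mid X = x^*) + x^{*2}\, \mathrm{Var}(\hat{\beta}_1 \mid X = x^*) + 2 x^*\, \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1 \mid X = x^*)
\end{aligned}$$

Now,

$$\begin{aligned}
\mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1 \mid X = x^*) &= \mathrm{Cov}(\bar{y} - \hat{\beta}_1 \bar{x},\, \hat{\beta}_1 \mid X = x^*) \\
&= \mathrm{Cov}(\bar{y}, \hat{\beta}_1 \mid X = x^*) - \bar{x}\, \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_1) \\
&= 0 - \bar{x}\, \mathrm{Var}(\hat{\beta}_1) \\
&= -\frac{\bar{x} \sigma^2}{SXX}
\end{aligned}$$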

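So that,

$$\begin{aligned}
\mathrm{Var}(\hat{y} \mid X = x^*) &= \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right) + \frac{x^{*2} \sigma^2}{SXX} - \frac{2 x^* \bar{x} \sigma^2}{SXX} \\
&= \sigma^2 \left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right)
\end{aligned}$$

Result (2.14) follows from the fact that under assumption (4), $\hat{\beta}_0 \mid X$ is normally distributed, as is $\hat{\beta}_1 \mid X$.

2.7.4 Prediction Intervals for the Actual Value of Y

We base our prediction of Y when $X = x^*$ (that is of $Y^*$) on

$$\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$$

The error in our prediction is

$$Y^* - \hat{y}^* = \beta_0 + \beta_1 x^* + e^* - \hat{y}^* = E(Y \mid X = x^*) - \hat{y}^* + e^*$$

that is, the deviation between $E(Y \mid X = x^*)$ and $\hat{y}^*$ plus the random fluctuation $e^*$ (which represents the deviation of $Y^*$ from $E(Y \mid X = x^*)$).

Under the assumptions stated previously, we shall derive (2.15), (2.16) and (2.17). First, we consider (2.15)

$$\begin{aligned}
E(Y^* - \hat{y}^*) &= E(Y - \hat{y} \mid X = x^*) \\
&= E(Y \mid X = x^*) - E(\hat{\beta}_0 + \hat{\beta}_1 x \mid X = x^*) \\
&= (\beta_0 + \beta_1 x^*) - (\beta_0 + \beta_1 x^*) \\
&= 0
\end{aligned}$$

In considering (2.16), notice that $\hat{y}$ is independent of $Y^*$, a future value of Y, so the covariance term below is zero. Thus,

$$\begin{aligned}
\mathrm{Var}(Y^* - \hat{y}^*) &= \mathrm{Var}(Y - \hat{y} \mid X = x^*) \\
&= \mathrm{Var}(Y \mid X = x^*) + \mathrm{Var}(\hat{y} \mid X = x^*) - 2\, \mathrm{Cov}(Y, \hat{y} \mid X = x^*) \\
&= \sigma^2 + \sigma^2 \left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right] - 0 \\
&= \sigma^2 \left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}\right]
\end{aligned}$$

Finally, (2.17) follows since both $\hat{y}$ and $Y^*$ are normally distributed.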
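The difference between the confidence interval for the regression line and the prediction interval for a new observation is the extra "1 +" under the square root. The Python sketch below computes both intervals at a chosen $x^*$ for hypothetical toy data (not the production data). The function name `intervals_at` is an assumption of this sketch, and the t quantile is passed in by the caller because the Python standard library has no t-distribution; the value 3.182 used here is the tabulated 97.5th percentile of the t-distribution with 3 degrees of freedom.

```python
import math


def intervals_at(x, y, x_star, t_quantile):
    """Return (fit, confidence_interval, prediction_interval) at x_star.

    t_quantile is t(alpha/2, n - 2), supplied by the caller (e.g. from t tables).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    rss = sum((yi - (b0_hat + b1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))                       # residual standard error S
    fit = b0_hat + b1_hat * x_star
    var_factor = 1 / n + (x_star - x_bar) ** 2 / sxx   # bracketed term in (2.13)
    ci_half = t_quantile * s * math.sqrt(var_factor)       # for E(Y | X = x*)
    pi_half = t_quantile * s * math.sqrt(1 + var_factor)   # for a single new Y*
    return fit, (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)


# Hypothetical toy data (NOT the production data from the chapter).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

T_QUANTILE_3DF = 3.182  # t(0.025, 3) from standard t tables
fit, ci, pi = intervals_at(x, y, 4.0, T_QUANTILE_3DF)
print(fit, ci, pi)
```

Both intervals are centered at the same fitted value; the prediction interval is always the wider of the two, and its width does not shrink to zero as n grows.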

2.8 Exercises

1. The web site provides weekly reports on the box office ticket sales for plays on Broadway in New York. We shall consider the data for the week October 11–17, 2004 (referred to below as the current week). The data are in the form of the gross box office results for the current week and the gross box office results for the previous week (i.e., October 3–10, 2004). The data, plotted in Figure 2.6, are available on the book web site in the file playbill.csv.

[Figure 2.6 Scatter plot of gross box office results from Broadway (Gross Box Office Results Current Week against Gross Box Office Results Previous Week)]

Fit the following model to the data: $Y = \beta_0 + \beta_1 x + e$ where Y is the gross box office results for the current week (in $) and x is the gross box office results for the previous week (in $). Complete the following tasks:

(a) Find a 95% confidence interval for the slope of the regression model, $\beta_1$. Is 1 a plausible value for $\beta_1$? Give a reason to support your answer.
(b) Test the null hypothesis $H_0: \beta_0 = 10000$ against a two-sided alternative. Interpret your result.
(c) Use the fitted regression model to estimate the gross box office results for the current week (in $) for a production with $400,000 in gross box office the previous week. Find a 95% prediction interval for the gross box office

results for the current week (in $) for a production with $400,000 in gross box office the previous week. Is $450,000 a feasible value for the gross box office results in the current week, for a production with $400,000 in gross box office the previous week? Give a reason to support your answer.
(d) Some promoters of Broadway plays use the prediction rule that next week's gross box office results will be equal to this week's gross box office results. Comment on the appropriateness of this rule.

2. A story by James R. Hagerty entitled "With Buyers Sidelined, Home Prices Slide", published in the Thursday October 25, 2007 edition of the Wall Street Journal, contained data on so-called fundamental housing indicators in major real estate markets across the US. The author argues that prices are generally falling and overdue loan payments are piling up. Thus, we shall consider data presented in the article on

Y = Percentage change in average price from July 2006 to July 2007 (based on the S&P/Case-Shiller national housing index); and
x = Percentage of mortgage loans 30 days or more overdue in latest quarter (based on data from Equifax and Moody's).

The data are available on the book web site in the file indicators.txt. Fit the following model to the data: $Y = \beta_0 + \beta_1 x + e$. Complete the following tasks:

(a) Find a 95% confidence interval for the slope of the regression model, $\beta_1$. On the basis of this confidence interval decide whether there is evidence of a significant negative linear association.
(b) Use the fitted regression model to estimate $E(Y \mid X = 4)$. Find a 95% confidence interval for $E(Y \mid X = 4)$. Is 0% a feasible value for $E(Y \mid X = 4)$? Give a reason to support your answer.

3. The manager of the purchasing department of a large company would like to develop a regression model to predict the average amount of time it takes to process a given number of invoices. Over a 30-day period, data are collected on the number of invoices processed and the total time taken (in hours). The data are available on the book web site in the file invoices.txt. The following model was fit to the data: $Y = \beta_0 + \beta_1 x + e$ where Y is the processing time and x is the number of invoices.

A plot of the data and the fitted model can be found in Figure 2.7. Utilizing the output from the fit of this model provided below, complete the following tasks.

(a) Find a 95% confidence interval for the start-up time, i.e., $\beta_0$.
(b) Suppose that a best practice benchmark for the average processing time for an additional invoice is 0.01 hours (or 0.6 minutes). Test the null hypothesis $H_0: \beta_1 = 0.01$ against a two-sided alternative. Interpret your result.
(c) Find a point estimate and a 95% prediction interval for the time taken to process 130 invoices.

[Figure 2.7 Scatter plot of the invoice data (Processing Time against Number of Invoices)]

Regression output from R for the invoice data

    Call:
    lm(formula = Time ~ Invoices)
    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.6417099  0.1222707   5.248 1.41e-05 ***
    Invoices    0.0112916  0.0008184  13.797 5.17e-14 ***
    ---
    Residual standard error: 0.3298 on 28 degrees of freedom
    Multiple R-Squared: 0.8718, Adjusted R-squared: 0.8672
    F-statistic: 190.4 on 1 and 28 DF, p-value: 5.175e-14

    mean(Time)      …
    median(Time)    …
    mean(Invoices)  130.0
    median(Invoices) …

4. Straight-line regression through the origin: In this question we shall make the following assumptions:

(1) Y is related to x by the simple linear regression model $Y_i = \beta x_i + e_i$ $(i = 1, 2, \ldots, n)$, i.e., $E(Y \mid X = x_i) = \beta x_i$

(2) The errors $e_1, e_2, \ldots, e_n$ are independent of each other
(3) The errors $e_1, e_2, \ldots, e_n$ have a common variance $\sigma^2$
(4) The errors are normally distributed with a mean of 0 and variance $\sigma^2$ (especially when the sample size is small), i.e., $e \mid X \sim N(0, \sigma^2)$

In addition, since the regression model is conditional on X we can assume that the values of the predictor variable, $x_1, x_2, \ldots, x_n$ are known fixed constants.

(a) Show that the least squares estimate of $\beta$ is given by

$$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$$

(b) Under the above assumptions show that

(i) $E(\hat{\beta} \mid X) = \beta$

(ii) $\mathrm{Var}(\hat{\beta} \mid X) = \dfrac{\sigma^2}{\sum_{i=1}^{n} x_i^2}$

(iii) $\hat{\beta} \mid X \sim N\left(\beta, \dfrac{\sigma^2}{\sum_{i=1}^{n} x_i^2}\right)$

5. Two alternative straight line regression models have been proposed for Y. In the first model, Y is a linear function of $x_1$, while in the second model Y is a linear function of $x_2$. The plot in the first column of Figure 2.8 is that of Y against $x_1$, while the plot in the second column is that of Y against $x_2$. These plots also show the least squares regression lines. In the following statements RSS stands for residual sum of squares, while SSreg stands for regression sum of squares. Which one of the following statements is true?

(a) RSS for model 1 is greater than RSS for model 2, while SSreg for model 1 is greater than SSreg for model 2.
(b) RSS for model 1 is less than RSS for model 2, while SSreg for model 1 is less than SSreg for model 2.
(c) RSS for model 1 is greater than RSS for model 2, while SSreg for model 1 is less than SSreg for model 2.
(d) RSS for model 1 is less than RSS for model 2, while SSreg for model 1 is greater than SSreg for model 2.

Give a detailed reason to support your choice.

[Figure 2.8 Scatter plots and least squares lines (panels: Model 1, y against $x_1$; Model 2, y against $x_2$)]

6. In this problem we will show that SST = SSreg + RSS. To do this we will show that $\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = 0$.

(a) Show that $(y_i - \hat{y}_i) = (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x})$.
(b) Show that $(\hat{y}_i - \bar{y}) = \hat{\beta}_1 (x_i - \bar{x})$.
(c) Utilizing the fact that $\hat{\beta}_1 = \dfrac{SXY}{SXX}$, show that $\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = 0$.

7. A statistics professor has been involved in a collaborative research project with two entomologists. The statistics part of the project involves fitting regression models to large data sets. Together they have written and submitted a manuscript to an entomology journal. The manuscript contains a number of scatter plots with each showing an estimated regression line (based on a valid model) and

29 .8 Exercses 43 assocated dvdual 95% cofdece tervals for the regresso fucto at each x value, as well as the observed data. A referee has asked the followg questo: I do t uderstad how 95% of the observatos fall outsde the 95% CI as depcted the fgures. Brefly expla how t s etrely possble that 95% of the observatos fall outsde the 95% CI as depcted the fgures.
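The algebraic results in Exercises 4(a) and 6 can be sanity-checked numerically before attempting the proofs. The sketch below is only an illustration, not a solution: the data values and the helper names (`slope_through_origin`, `rss_through_origin`, `fit_line`) are invented for this example and do not come from the text. It computes the no-intercept least squares slope, confirms that nudging the slope does not lower the residual sum of squares, and then checks that the cross term in Exercise 6 is zero up to floating-point error.

```python
# Numerical sanity check (not a proof) for Exercises 4(a) and 6.
# The data are made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def slope_through_origin(x, y):
    """Exercise 4(a): beta_hat = sum(x_i * y_i) / sum(x_i ** 2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def rss_through_origin(b, x, y):
    """Residual sum of squares for the no-intercept line y = b * x."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))


def fit_line(x, y):
    """Least squares intercept and slope, with slope = SXY / SXX."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


# Exercise 4(a): moving the slope away from beta_hat should not reduce the RSS.
b_origin = slope_through_origin(x, y)
rss_best = rss_through_origin(b_origin, x, y)
rss_nudged = min(rss_through_origin(b_origin + d, x, y) for d in (-0.01, 0.01))

# Exercise 6: the cross term vanishes, so SST = SSreg + RSS.
b0, b1 = fit_line(x, y)
ybar = sum(y) / len(y)
yhat = [b0 + b1 * xi for xi in x]
sst = sum((yi - ybar) ** 2 for yi in y)
ssreg = sum((yh - ybar) ** 2 for yh in yhat)
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
cross = sum((yi - yh) * (yh - ybar) for yi, yh in zip(y, yhat))

print("through-origin slope:", b_origin)
print("RSS at beta_hat vs nudged slope:", rss_best, rss_nudged)
print("cross term:", cross)
print("SST - (SSreg + RSS):", sst - (ssreg + rss))
```

A check like this only shows the identities hold for one data set; the exercises still require the general algebraic argument.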

30

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

STA302/1001-Fall 2008 Midterm Test October 21, 2008

STA302/1001-Fall 2008 Midterm Test October 21, 2008 STA3/-Fall 8 Mdterm Test October, 8 Last Name: Frst Name: Studet Number: Erolled (Crcle oe) STA3 STA INSTRUCTIONS Tme allowed: hour 45 mutes Ads allowed: A o-programmable calculator A table of values from

More information

Objectives of Multiple Regression

Objectives of Multiple Regression Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of

More information

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades STAT 101 Dr. Kar Lock Morga 11/20/12 Exam 2 Grades Multple Regresso SECTIONS 9.2, 10.1, 10.2 Multple explaatory varables (10.1) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (10.2) Trasformatos

More information

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y.

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y. .46. a. The frst varable (X) s the frst umber the par ad s plotted o the horzotal axs, whle the secod varable (Y) s the secod umber the par ad s plotted o the vertcal axs. The scatterplot s show the fgure

More information

ENGI 3423 Simple Linear Regression Page 12-01

ENGI 3423 Simple Linear Regression Page 12-01 ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable

More information

Lecture 8: Linear Regression

Lecture 8: Linear Regression Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE

More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis LINEA EGESSION ANALYSIS MODULE III Lecture - 4 Multple Lear egresso Aalyss Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur Cofdece terval estmato The cofdece tervals multple

More information

Chapter 13 Student Lecture Notes 13-1

Chapter 13 Student Lecture Notes 13-1 Chapter 3 Studet Lecture Notes 3- Basc Busess Statstcs (9 th Edto) Chapter 3 Smple Lear Regresso 4 Pretce-Hall, Ic. Chap 3- Chapter Topcs Types of Regresso Models Determg the Smple Lear Regresso Equato

More information

Statistics MINITAB - Lab 5

Statistics MINITAB - Lab 5 Statstcs 10010 MINITAB - Lab 5 PART I: The Correlato Coeffcet Qute ofte statstcs we are preseted wth data that suggests that a lear relatoshp exsts betwee two varables. For example the plot below s of

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON430 Statstcs Date of exam: Frday, December 8, 07 Grades are gve: Jauary 4, 08 Tme for exam: 0900 am 00 oo The problem set covers 5 pages Resources allowed:

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model 1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed

More information

residual. (Note that usually in descriptions of regression analysis, upper-case

residual. (Note that usually in descriptions of regression analysis, upper-case Regresso Aalyss Regresso aalyss fts or derves a model that descres the varato of a respose (or depedet ) varale as a fucto of oe or more predctor (or depedet ) varales. The geeral regresso model s oe of

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

Simple Linear Regression

Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn: Chapter 3 3- Busess Statstcs: A Frst Course Ffth Edto Chapter 2 Correlato ad Smple Lear Regresso Busess Statstcs: A Frst Course, 5e 29 Pretce-Hall, Ic. Chap 2- Learg Objectves I ths chapter, you lear:

More information

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uversty Mchael Bar Sprg 5 Mdterm am, secto Soluto Thursday, February 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes eam.. No calculators of ay kd are allowed..

More information

ESS Line Fitting

ESS Line Fitting ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here

More information

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model ECON 48 / WH Hog The Smple Regresso Model. Defto of the Smple Regresso Model Smple Regresso Model Expla varable y terms of varable x y = β + β x+ u y : depedet varable, explaed varable, respose varable,

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

Probability and. Lecture 13: and Correlation

Probability and. Lecture 13: and Correlation 933 Probablty ad Statstcs for Software ad Kowledge Egeers Lecture 3: Smple Lear Regresso ad Correlato Mocha Soptkamo, Ph.D. Outle The Smple Lear Regresso Model (.) Fttg the Regresso Le (.) The Aalyss of

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

Lecture Notes Types of economic variables

Lecture Notes Types of economic variables Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte

More information

Regresso What s a Model? 1. Ofte Descrbe Relatoshp betwee Varables 2. Types - Determstc Models (o radomess) - Probablstc Models (wth radomess) EPI 809/Sprg 2008 9 Determstc Models 1. Hypothesze

More information

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uverst Mchael Bar Sprg 5 Mdterm xam, secto Soluto Thursda, Februar 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes exam.. No calculators of a kd are allowed..

More information

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections ENGI 441 Jot Probablty Dstrbutos Page 7-01 Jot Probablty Dstrbutos [Navd sectos.5 ad.6; Devore sectos 5.1-5.] The jot probablty mass fucto of two dscrete radom quattes, s, P ad p x y x y The margal probablty

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Lear Regresso wth Oe Regressor AIM QA.7. Expla how regresso aalyss ecoometrcs measures the relatoshp betwee depedet ad depedet varables. A regresso aalyss has the goal of measurg how chages oe varable,

More information

Multiple Choice Test. Chapter Adequacy of Models for Regression

Multiple Choice Test. Chapter Adequacy of Models for Regression Multple Choce Test Chapter 06.0 Adequac of Models for Regresso. For a lear regresso model to be cosdered adequate, the percetage of scaled resduals that eed to be the rage [-,] s greater tha or equal to

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Statistics: Unlocking the Power of Data Lock 5

Statistics: Unlocking the Power of Data Lock 5 STAT 0 Dr. Kar Lock Morga Exam 2 Grades: I- Class Multple Regresso SECTIONS 9.2, 0., 0.2 Multple explaatory varables (0.) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (0.2) Exam 2 Re- grades Re-

More information

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger Example: Multple lear regresso 5000,00 4000,00 Tro Aders Moger 0.0.007 brthweght 3000,00 000,00 000,00 0,00 50,00 00,00 50,00 00,00 50,00 weght pouds Repetto: Smple lear regresso We defe a model Y = β0

More information

STA 105-M BASIC STATISTICS (This is a multiple choice paper.)

STA 105-M BASIC STATISTICS (This is a multiple choice paper.) DCDM BUSINESS SCHOOL September Mock Eamatos STA 0-M BASIC STATISTICS (Ths s a multple choce paper.) Tme: hours 0 mutes INSTRUCTIONS TO CANDIDATES Do ot ope ths questo paper utl you have bee told to do

More information

Simple Linear Regression - Scalar Form

Simple Linear Regression - Scalar Form Smple Lear Regresso - Scalar Form Q.. Model Y X,..., p..a. Derve the ormal equatos that mmze Q. p..b. Solve for the ordary least squares estmators, p..c. Derve E, V, E, V, COV, p..d. Derve the mea ad varace

More information

Simple Linear Regression and Correlation. Applied Statistics and Probability for Engineers. Chapter 11 Simple Linear Regression and Correlation

Simple Linear Regression and Correlation. Applied Statistics and Probability for Engineers. Chapter 11 Simple Linear Regression and Correlation 4//6 Appled Statstcs ad Probablty for Egeers Sth Edto Douglas C. Motgomery George C. Ruger Chapter Smple Lear Regresso ad Correlato CHAPTER OUTLINE Smple Lear Regresso ad Correlato - Emprcal Models -8

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1 STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal

More information

Handout #8. X\Y f(x) 0 1/16 1/ / /16 3/ / /16 3/16 0 3/ /16 1/16 1/8 g(y) 1/16 1/4 3/8 1/4 1/16 1

Handout #8. X\Y f(x) 0 1/16 1/ / /16 3/ / /16 3/16 0 3/ /16 1/16 1/8 g(y) 1/16 1/4 3/8 1/4 1/16 1 Hadout #8 Ttle: Foudatos of Ecoometrcs Course: Eco 367 Fall/05 Istructor: Dr. I-Mg Chu Lear Regresso Model So far we have focused mostly o the study of a sgle radom varable, ts correspodg theoretcal dstrbuto,

More information

Chapter Two. An Introduction to Regression ( )

Chapter Two. An Introduction to Regression ( ) ubject: A Itroducto to Regresso Frst tage Chapter Two A Itroducto to Regresso (018-019) 1 pg. ubject: A Itroducto to Regresso Frst tage A Itroducto to Regresso Regresso aalss s a statstcal tool for the

More information

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss

More information

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Mean is only appropriate for interval or ratio scales, not ordinal or nominal. Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uverst Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Simple Linear Regression and Correlation.

Simple Linear Regression and Correlation. Smple Lear Regresso ad Correlato. Correspods to Chapter 0 Tamhae ad Dulop Sldes prepared b Elzabeth Newto (MIT) wth some sldes b Jacquele Telford (Johs Hopks Uverst) Smple lear regresso aalss estmates

More information

Lecture 3. Sampling, sampling distributions, and parameter estimation

Lecture 3. Sampling, sampling distributions, and parameter estimation Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called

More information

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use.

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use. INTRODUCTORY NOTE ON LINEAR REGREION We have data of the form (x y ) (x y ) (x y ) These wll most ofte be preseted to us as two colum of a spreadsheet As the topc develops we wll see both upper case ad

More information

ENGI 4421 Propagation of Error Page 8-01

ENGI 4421 Propagation of Error Page 8-01 ENGI 441 Propagato of Error Page 8-01 Propagato of Error [Navd Chapter 3; ot Devore] Ay realstc measuremet procedure cotas error. Ay calculatos based o that measuremet wll therefore also cota a error.

More information

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg

More information

Reaction Time VS. Drug Percentage Subject Amount of Drug Times % Reaction Time in Seconds 1 Mary John Carl Sara William 5 4

Reaction Time VS. Drug Percentage Subject Amount of Drug Times % Reaction Time in Seconds 1 Mary John Carl Sara William 5 4 CHAPTER Smple Lear Regreo EXAMPLE A expermet volvg fve ubject coducted to determe the relatohp betwee the percetage of a certa drug the bloodtream ad the legth of tme t take the ubject to react to a tmulu.

More information

UNIVERSITY OF EAST ANGLIA. Main Series UG Examination

UNIVERSITY OF EAST ANGLIA. Main Series UG Examination UNIVERSITY OF EAST ANGLIA School of Ecoomcs Ma Seres UG Examato 03-4 INTRODUCTORY MATHEMATICS AND STATISTICS FOR ECONOMISTS ECO-400Y Tme allowed: 3 hours Aswer ALL questos from both Sectos. Aswer EACH

More information

Chapter 11 The Analysis of Variance

Chapter 11 The Analysis of Variance Chapter The Aalyss of Varace. Oe Factor Aalyss of Varace. Radomzed Bloc Desgs (ot for ths course) NIPRL . Oe Factor Aalyss of Varace.. Oe Factor Layouts (/4) Suppose that a expermeter s terested populatos

More information

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model Chapter 3 Asmptotc Theor ad Stochastc Regressors The ature of eplaator varable s assumed to be o-stochastc or fed repeated samples a regresso aalss Such a assumpto s approprate for those epermets whch

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

ECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity

ECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity ECONOMETRIC THEORY MODULE VIII Lecture - 6 Heteroskedastcty Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur . Breusch Paga test Ths test ca be appled whe the replcated data

More information

Lecture 2: Linear Least Squares Regression

Lecture 2: Linear Least Squares Regression Lecture : Lear Least Squares Regresso Dave Armstrog UW Mlwaukee February 8, 016 Is the Relatoshp Lear? lbrary(car) data(davs) d 150) Davs$weght[d]

More information

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations HP 30S Statstcs Averages ad Stadard Devatos Average ad Stadard Devato Practce Fdg Averages ad Stadard Devatos HP 30S Statstcs Averages ad Stadard Devatos Average ad stadard devato The HP 30S provdes several

More information

Lecture 1 Review of Fundamental Statistical Concepts

Lecture 1 Review of Fundamental Statistical Concepts Lecture Revew of Fudametal Statstcal Cocepts Measures of Cetral Tedecy ad Dsperso A word about otato for ths class: Idvduals a populato are desgated, where the dex rages from to N, ad N s the total umber

More information

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67.

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67. Ecoomcs 3 Itroducto to Ecoometrcs Sprg 004 Professor Dobk Name Studet ID Frst Mdterm Exam You must aswer all the questos. The exam s closed book ad closed otes. You may use your calculators but please

More information

University of Belgrade. Faculty of Mathematics. Master thesis Regression and Correlation

University of Belgrade. Faculty of Mathematics. Master thesis Regression and Correlation Uversty of Belgrade Vrtual Lbrary of Faculty of Mathematcs - Uversty of Belgrade Faculty of Mathematcs Master thess Regresso ad Correlato The caddate Supervsor Karma Ibrahm Soufya Vesa Jevremovć Jue 014

More information

Module 7: Probability and Statistics

Module 7: Probability and Statistics Lecture 4: Goodess of ft tests. Itroducto Module 7: Probablty ad Statstcs I the prevous two lectures, the cocepts, steps ad applcatos of Hypotheses testg were dscussed. Hypotheses testg may be used to

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the

More information

CHAPTER 2. = y ˆ β x (.1022) So we can write

CHAPTER 2. = y ˆ β x (.1022) So we can write CHAPTER SOLUTIONS TO PROBLEMS. () Let y = GPA, x = ACT, ad = 8. The x = 5.875, y = 3.5, (x x )(y y ) = 5.85, ad (x x ) = 56.875. From equato (.9), we obta the slope as ˆβ = = 5.85/56.875., rouded to four

More information

Analysis of Variance with Weibull Data

Analysis of Variance with Weibull Data Aalyss of Varace wth Webull Data Lahaa Watthaacheewaul Abstract I statstcal data aalyss by aalyss of varace, the usual basc assumptos are that the model s addtve ad the errors are radomly, depedetly, ad

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Revew for the prevous lecture: Theorems ad Examples: How to obta the pmf (pdf) of U = g (, Y) ad V = g (, Y) Chapter 4 Multple Radom Varables Chapter 44 Herarchcal Models ad Mxture Dstrbutos Examples:

More information

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs CLASS NOTES for PBAF 58: Quattatve Methods II SPRING 005 Istructor: Jea Swaso Dael J. Evas School of Publc Affars Uversty of Washgto Ackowledgemet: The structor wshes to thak Rachel Klet, Assstat Professor,

More information

Simulation Output Analysis

Simulation Output Analysis Smulato Output Aalyss Summary Examples Parameter Estmato Sample Mea ad Varace Pot ad Iterval Estmato ermatg ad o-ermatg Smulato Mea Square Errors Example: Sgle Server Queueg System x(t) S 4 S 4 S 3 S 5

More information

Bootstrap Method for Testing of Equality of Several Coefficients of Variation

Bootstrap Method for Testing of Equality of Several Coefficients of Variation Cloud Publcatos Iteratoal Joural of Advaced Mathematcs ad Statstcs Volume, pp. -6, Artcle ID Sc- Research Artcle Ope Access Bootstrap Method for Testg of Equalty of Several Coeffcets of Varato Dr. Navee

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Marquette Uverst Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Coprght 08 b Marquette Uverst Maxmum Lkelhood Estmato We have bee sag that ~

More information

MEASURES OF DISPERSION

MEASURES OF DISPERSION MEASURES OF DISPERSION Measure of Cetral Tedecy: Measures of Cetral Tedecy ad Dsperso ) Mathematcal Average: a) Arthmetc mea (A.M.) b) Geometrc mea (G.M.) c) Harmoc mea (H.M.) ) Averages of Posto: a) Meda

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

Fundamentals of Regression Analysis

Fundamentals of Regression Analysis Fdametals of Regresso Aalyss Regresso aalyss s cocered wth the stdy of the depedece of oe varable, the depedet varable, o oe or more other varables, the explaatory varables, wth a vew of estmatg ad/or

More information

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers

More information

Applied Statistics and Probability for Engineers, 5 th edition February 23, b) y ˆ = (85) =

Applied Statistics and Probability for Engineers, 5 th edition February 23, b) y ˆ = (85) = Appled Statstcs ad Probablty for Egeers, 5 th edto February 3, y.8.7.6.5.4.3.. -5 5 5 x b) y ˆ.3999 +.46(85).6836 c) y ˆ.3999 +.46(9).744 d) ˆ.46-3 a) Regresso Aalyss: Ratg Pots versus Meters per Att The

More information

Lecture 1: Introduction to Regression

Lecture 1: Introduction to Regression Lecture : Itroducto to Regresso A Eample: Eplag State Homcde Rates What kds of varables mght we use to epla/predct state homcde rates? Let s cosder just oe predctor for ow: povert Igore omtted varables,

More information

4. Standard Regression Model and Spatial Dependence Tests

4. Standard Regression Model and Spatial Dependence Tests 4. Stadard Regresso Model ad Spatal Depedece Tests Stadard regresso aalss fals the presece of spatal effects. I case of spatal depedeces ad/or spatal heterogeet a stadard regresso model wll be msspecfed.

More information

ε. Therefore, the estimate

ε. Therefore, the estimate Suggested Aswers, Problem Set 3 ECON 333 Da Hugerma. Ths s ot a very good dea. We kow from the secod FOC problem b) that ( ) SSE / = y x x = ( ) Whch ca be reduced to read y x x = ε x = ( ) The OLS model

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA THE ROYAL STATISTICAL SOCIETY EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER II STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for

More information

Statistics. Correlational. Dr. Ayman Eldeib. Simple Linear Regression and Correlation. SBE 304: Linear Regression & Correlation 1/3/2018

Statistics. Correlational. Dr. Ayman Eldeib. Simple Linear Regression and Correlation. SBE 304: Linear Regression & Correlation 1/3/2018 /3/08 Sstems & Bomedcal Egeerg Departmet SBE 304: Bo-Statstcs Smple Lear Regresso ad Correlato Dr. Ama Eldeb Fall 07 Descrptve Orgasg, summarsg & descrbg data Statstcs Correlatoal Relatoshps Iferetal Geeralsg

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

TESTS BASED ON MAXIMUM LIKELIHOOD

TESTS BASED ON MAXIMUM LIKELIHOOD ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal

More information

Chapter 2 Supplemental Text Material

Chapter 2 Supplemental Text Material -. Models for the Data ad the t-test Chapter upplemetal Text Materal The model preseted the text, equato (-3) s more properl called a meas model. ce the mea s a locato parameter, ths tpe of model s also

More information

Chapter 3 Sampling For Proportions and Percentages

Chapter 3 Sampling For Proportions and Percentages Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys

More information

"It is the mark of a truly intelligent person to be moved by statistics." George Bernard Shaw

It is the mark of a truly intelligent person to be moved by statistics. George Bernard Shaw Chapter 0 Chapter 0 Lear Regresso ad Correlato "It s the mark of a truly tellget perso to be moved by statstcs." George Berard Shaw Source: https://www.google.com.ph/search?q=house+ad+car+pctures&bw=366&bh=667&tbm

More information
