UNIT 11 MULTIPLE LINEAR REGRESSION


Structure

11.1 Introduction
     Objectives
11.2 Multiple Linear Regression Model
11.3 Estimation of Model Parameters
     Use of Matrix Notation
     Properties of Least Squares Estimates
11.4 Test of Significance in Multiple Regression
11.5 Coefficient of Determination ($R^2$) and Adjusted $R^2$
11.6 Regression with Dummy Variables
11.7 Summary
11.8 Solutions/Answers

11.1 INTRODUCTION

In previous units, we have discussed the linear relationship between the dependent variable Y and an independent variable X. The coefficients a and b were unknown and, for the given data on Y and X, we obtained the least squares estimates of the parameters, i.e., $\hat{a}$ and $\hat{b}$. We also went through the inferential study to examine whether or not there exists a significant linear relationship between Y and X. We discussed the simple linear regression model and the estimation of model parameters, and determined their standard errors.

In this unit, we discuss the multiple linear regression model along with the estimation of parameters in Secs. 11.2 and 11.3. In multiple linear regression, the basic concept is the same as that of simple regression. However, instead of one independent variable, there are several independent variables, say, $X_1, X_2, X_3, \ldots, X_p$. For example, the number of units sold by a car manufacturing company per year may depend not only on one independent variable such as price, but also on mileage per unit of fuel, appearance of the car, comfort level, durability, money spent on advertising, etc. Here we may like to identify the important independent variables, i.e., those which contribute more to the variation in the dependent variable. For this purpose, a mathematical relationship between the dependent and independent variables is established and this relation is further used for prediction.

We also discuss the inferential study in multiple linear regression in Sec. 11.4. Since the model may involve several independent variables affecting the dependent variable, it may be of interest to assess their importance by estimating the regression coefficients along with their standard errors.

The adequacy of the model fit may be examined with the help of the coefficient of determination ($R^2$). In this unit, we also discuss a method for calculating $R^2$ and adjusted $R^2$ in Sec. 11.5. Regression analysis with dummy variables is discussed in Sec. 11.6.

In the next unit, we shall discuss how to calculate the extra sum of squares explained by the regressor variables on the response variable. We shall also discuss the methods of selection of important regressor variables, which play an important role in the selection of the best fitted models.

Objectives

After studying this unit, you should be able to:
- explain the concept of multiple linear regression;
- formulate a multiple linear regression model;
- estimate the regression coefficients and their standard errors;
- calculate the coefficient of determination ($R^2$) and adjusted $R^2$; and
- predict the dependent variable for given values of independent variables.

11.2 MULTIPLE LINEAR REGRESSION MODEL

In this section, we generalise the simple regression model considered in Unit 9. We assumed in Unit 9 that $(Y_1, X_1), (Y_2, X_2), \ldots, (Y_n, X_n)$ are n pairs of values. The equation of the simple linear regression model may be written as

$$Y = a + bX + e$$

where e represents the error term, which arises from the difference between the observed Y and the straight line $Y = a + bX$. To minimise the error, we use the method of least squares. From the above equation, we may write the simple regression model as

$$Y_i = a + bX_i + e_i, \quad i = 1, 2, \ldots, n$$

for the sample data of n pairs $(Y_i, X_i)$, $i = 1, 2, \ldots, n$.

In agriculture, the crop yield depends on more than one variable such as fertility of the soil, amount of rainfall, amount of fertilisers, etc. A multiple regression model that might describe this relationship is

$$Y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + e$$

where Y denotes the yield, $X_1$ denotes the fertility of soil, $X_2$ denotes the rainfall and $X_3$ denotes the amount of fertilisers used. This is called the multiple linear regression model with three independent/regressor variables. The term linear is used because the dependent/response variable Y is a linear function of the unknown parameters $B_0, B_1, B_2$ and $B_3$.

In general, the response variable may be related to p regressors or independent variables. Let Y be the dependent variable and $X_1, X_2, \ldots, X_p$ be p independent variables. Then the multiple regression model can be written as:

$$Y = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_p X_p + e \tag{1}$$

The parameters $B_0, B_1, \ldots, B_p$ are called the regression coefficients. The parameter $B_j$ ($j = 1, 2, \ldots, p$) represents the expected change in the response variable Y per unit change in $X_j$ when the remaining regressor variables are held constant. For the sake of simplicity, we shall attach a dummy variable $X_0$ to the intercept $B_0$; $X_0$ takes the value 1 for all observations. The model in equation (1) can then be written as:

$$Y = B_0 X_0 + B_1 X_1 + B_2 X_2 + \cdots + B_p X_p + e \tag{2}$$
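The interpretation of the coefficients in equation (2) can be illustrated with a minimal sketch. The coefficient values and the helper name `predict` below are hypothetical, chosen only for illustration; they do not come from this unit.

```python
def predict(coeffs, xs):
    """Evaluate B0*X0 + B1*X1 + ... + Bp*Xp with X0 = 1 (error term omitted)."""
    row = [1.0] + list(xs)          # prepend the dummy regressor X0 = 1
    return sum(b * x for b, x in zip(coeffs, row))

# Hypothetical coefficients B0, B1, B2 (illustrative only).
B = [10.0, 2.0, 0.5]

base = predict(B, [3.0, 4.0])       # X1 = 3, X2 = 4
bumped = predict(B, [4.0, 4.0])     # X1 raised by one unit, X2 held fixed

# The change in the predicted value equals B1, the partial regression coefficient.
print(bumped - base)
```

Raising $X_1$ by one unit while holding $X_2$ fixed changes the predicted value by exactly $B_1$, which is what "partial" regression coefficient means.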

The simple regression model considered in Unit 9 becomes a particular case of this model with $X_0 = 1$, $B_0 = a$, $B_1 = b$ and $B_i = 0$ ($i \ge 2$). The interpretation of the coefficients $B_j$ ($j = 1, 2, \ldots, p$) is that $B_j$ represents the amount of change in Y for a unit change in $X_j$, keeping the other independent variables $X_k$ ($k \ne j$) fixed. These coefficients are known as partial regression coefficients, as the effect of one independent variable on the dependent variable is studied while the other variables are held fixed or constant. We use the term multiple linear regression for this model because two or more variables are included in the regression analysis and the parameters $B_0, B_1, \ldots, B_p$ appear in a linear form. Moreover, the effect of these variables can be studied jointly.

Here $X_i$ can be any continuous function such as $\log X$, $X^2$, $X^3$, etc. However, it is necessary that the equation is linear in the parameters. Let us consider a polynomial model

$$Y = B_0 + B_1 X + B_2 X^2 + \cdots + B_p X^p + e$$

If we let $X_1 = X$, $X_2 = X^2$, $X_3 = X^3$ and so on, the above model can be written in a linear form as given in equation (1).

As in the case of simple linear regression (Unit 9), here too we make the assumption that e is normally and independently distributed with mean zero and constant variance $\sigma^2$.

11.3 ESTIMATION OF MODEL PARAMETERS

Recall that in Unit 9, we estimated the parameters a and b of a simple linear regression equation using the method of least squares. In this method, we minimise the sum of the squares of the differences between the observed values and their expected values, i.e., the sum of squares of the error terms. We also use the method of least squares to estimate the regression coefficients given in equation (2).

Let the number of observations be n (> p). Let $y_i$ denote the i-th observed value of the response and $x_{ji}$ denote the i-th observation of the regressor variable $X_j$. The data is represented as given in the table below:

$$\begin{array}{c|cccc}
\text{Response Variable } Y & \multicolumn{4}{c}{\text{Regressor Variables}} \\
 & X_1 & X_2 & \cdots & X_p \\ \hline
y_1 & x_{11} & x_{21} & \cdots & x_{p1} \\
y_2 & x_{12} & x_{22} & \cdots & x_{p2} \\
y_3 & x_{13} & x_{23} & \cdots & x_{p3} \\
\vdots & \vdots & \vdots & & \vdots \\
y_n & x_{1n} & x_{2n} & \cdots & x_{pn}
\end{array}$$

Then the multiple regression model for the i-th observation $Y_i$ can be written as:

$$Y_i = B_0 + B_1 X_{1i} + B_2 X_{2i} + \cdots + B_p X_{pi} + e_i, \quad i = 1, 2, \ldots, n$$

where $X_{1i}, X_{2i}, \ldots, X_{pi}$ are the corresponding values of the p independent variables, $B_0$ is the intercept, and $B_1, B_2, \ldots, B_p$ are the p regression coefficients corresponding to the independent variables $X_1, X_2, \ldots, X_p$, respectively.

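We now minimise $\sum e_i^2$, the sum of squares of errors in the model given in equation (2):

$$E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - B_0 X_{0i} - B_1 X_{1i} - \cdots - B_p X_{pi} \right)^2 \tag{3}$$

with respect to $B_0, B_1, \ldots, B_p$ to obtain their least squares estimates. For estimating the model parameters, we differentiate E with respect to $B_0, B_1, B_2, \ldots, B_p$, respectively, and equate each result to zero. If we differentiate E with respect to $B_j$, we obtain the j-th ($j = 0, 1, \ldots, p$) normal equation as follows:

$$\frac{\partial E}{\partial B_j} = -2 \sum_{i=1}^{n} \left( Y_i - B_0 X_{0i} - B_1 X_{1i} - \cdots - B_p X_{pi} \right) X_{ji} = 0, \quad j = 0, 1, 2, \ldots, p \tag{4}$$

Simplifying equation (4) and noting that $X_{0i} = 1$, we obtain the least squares normal equations:

$$\begin{aligned}
n B_0 + B_1 \sum X_{1i} + B_2 \sum X_{2i} + \cdots + B_p \sum X_{pi} &= \sum Y_i \\
B_0 \sum X_{1i} + B_1 \sum X_{1i}^2 + B_2 \sum X_{1i} X_{2i} + \cdots + B_p \sum X_{1i} X_{pi} &= \sum X_{1i} Y_i \\
B_0 \sum X_{2i} + B_1 \sum X_{2i} X_{1i} + B_2 \sum X_{2i}^2 + \cdots + B_p \sum X_{2i} X_{pi} &= \sum X_{2i} Y_i \\
&\;\;\vdots \\
B_0 \sum X_{pi} + B_1 \sum X_{pi} X_{1i} + B_2 \sum X_{pi} X_{2i} + \cdots + B_p \sum X_{pi}^2 &= \sum X_{pi} Y_i
\end{aligned} \tag{5}$$

where every sum runs over $i = 1, 2, \ldots, n$. These are p + 1 normal equations and can be solved using the methods for solving simultaneous linear equations. The solutions of the above normal equations, called the least squares estimates, are $\hat{B}_0, \hat{B}_1, \hat{B}_2, \ldots, \hat{B}_p$, respectively.

For simplicity, we shall rewrite the model in equation (2) by centralising the independent variables $X_1, X_2, \ldots, X_p$, i.e., by taking differences from their means:

$$\begin{aligned}
Y &= B_0 + B_1 \bar{X}_1 + \cdots + B_p \bar{X}_p + B_1 (X_1 - \bar{X}_1) + \cdots + B_p (X_p - \bar{X}_p) + e \\
  &= B_0' X_0 + B_1 (X_1 - \bar{X}_1) + \cdots + B_p (X_p - \bar{X}_p) + e
\end{aligned}$$

where $B_0' = B_0 + B_1 \bar{X}_1 + \cdots + B_p \bar{X}_p$. Here $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_p$ are the means of the p independent/regressor variables. With this, the normal equations become

$$\frac{\partial E}{\partial B_j} = -2 \sum_{i=1}^{n} \left[ Y_i - B_0' X_{0i} - B_1 (X_{1i} - \bar{X}_1) - \cdots - B_p (X_{pi} - \bar{X}_p) \right] (X_{ji} - \bar{X}_j) = 0 \tag{6}$$

Note that $X_{0i} = 1$ for all i. The coefficients $B_1, B_2, \ldots, B_p$ remain the same, but the intercept changes from $B_0$ to $B_0'$. Once we have obtained the estimates of $B_0', B_1, B_2, \ldots, B_p$, we can obtain $\hat{B}_0$ from the following equation:

$$\hat{B}_0 = \hat{B}_0' - \hat{B}_1 \bar{X}_1 - \cdots - \hat{B}_p \bar{X}_p \tag{7}$$

Let us consider an application of these results.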
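As a computational aside, the normal equations (5) for p = 2 can be assembled from the sums of squares and cross-products and solved as a system of simultaneous linear equations. The sketch below is only an illustration: the data are hypothetical (constructed so that $Y = 2 + 3X_1 - X_2$ exactly, with no error term), and the function names `normal_equations` and `solve` are not taken from this unit.

```python
def normal_equations(y, x1, x2):
    """Build the 3x3 system of normal equations (5) for p = 2."""
    n = len(y)

    def s(a, b=None):
        # Sum of a, or sum of products a*b when b is given.
        return sum(a) if b is None else sum(u * v for u, v in zip(a, b))

    A = [[n,     s(x1),     s(x2)],
         [s(x1), s(x1, x1), s(x1, x2)],
         [s(x2), s(x1, x2), s(x2, x2)]]
    rhs = [s(y), s(x1, y), s(x2, y)]
    return A, rhs


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):                 # back substitution
        x[r] = (M[r][m] - sum(M[r][k] * x[k] for k in range(r + 1, m))) / M[r][r]
    return x


# Hypothetical data: Y = 2 + 3*X1 - X2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2 + 3 * a - b for a, b in zip(x1, x2)]

A, rhs = normal_equations(y, x1, x2)
estimates = solve(A, rhs)
print(estimates)   # approximately [2, 3, -1], up to floating-point rounding
```

Because the hypothetical data contain no error, the least squares estimates recover the generating coefficients; with real data they would only approximate the unknown parameters.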

Example 1: A statistical analyst is analysing the vending machine routes in the distribution system. He/she is interested in predicting the amount of time required by the route driver to service the vending machines in an outlet. The company manager responsible for the study has suggested that the two most important variables affecting the delivery time Y (in minutes) are (i) the number of cases ($X_1$) and (ii) the distance travelled (in m) by the route driver ($X_2$). The delivery time data collected by the statistical analyst is given below:

Time (Y) | No. of Cases ($X_1$) | Distance ($X_2$)

Check whether there is a linear relationship between Y (time) and the two independent variables $X_1$ (number of cases) and $X_2$ (distance). Calculate the values of the regression coefficients and fit the regression equation.

Solution: To find the values of the regression coefficients and fit the regression equation for the given data, we form the following table:

Time (Y) | No. of Cases ($X_1$) | Distance ($X_2$) | $Y^2$ | $X_1^2$ | $X_2^2$ | $X_1 Y$ | $X_2 Y$ | $X_1 X_2$

Totals: $\sum Y_i$ = , $\sum X_{1i}$ = , $\sum X_{2i}$ = 35, $\sum Y_i^2$ = 38, $\sum X_{1i}^2$ = 95, $\sum X_{2i}^2$ = 5, $\sum X_{1i} Y_i$ = 85, $\sum X_{2i} Y_i$ = 68, $\sum X_{1i} X_{2i}$ = 35

Writing the normal equations (5) for p = 2, and noting that $X_{0i} = 1$, we get

$$\begin{aligned}
n \hat{B}_0 + \hat{B}_1 \sum X_{1i} + \hat{B}_2 \sum X_{2i} &= \sum Y_i \\
\hat{B}_0 \sum X_{1i} + \hat{B}_1 \sum X_{1i}^2 + \hat{B}_2 \sum X_{1i} X_{2i} &= \sum Y_i X_{1i} \\
\hat{B}_0 \sum X_{2i} + \hat{B}_1 \sum X_{1i} X_{2i} + \hat{B}_2 \sum X_{2i}^2 &= \sum Y_i X_{2i}
\end{aligned}$$

On putting the values calculated in the table in the above equations, we get

Bˆ Bˆ 35 Bˆ (i)
Bˆ 95 Bˆ 35 Bˆ 85 (ii)
35 Bˆ 35 Bˆ 5 Bˆ 68 (iii)

From equation (i), we have

ˆB Bˆ ˆ 35 B (iv)

On putting the value of $\hat{B}_0$ in equations (ii) and (iii) and simplifying, we get

4 Bˆ 8 Bˆ (v)
8 Bˆ 587 Bˆ 6 (vi)

On solving equations (v) and (vi), we get $\hat{B}_1$ = .3 and $\hat{B}_2$ = .356, and then $\hat{B}_0$ = .8765 from equation (iv). Hence, the fitted equation is

$\hat{Y}$ = .8765 + .3 $X_1$ + .356 $X_2$

So we can conclude that there is a linear relationship between Y (time in minutes) and the two independent variables $X_1$ (number of cases) and $X_2$ (distance). As the regression coefficients of both variables are positive, the delivery time increases with each of them. The numerical value of the regression coefficient $\hat{B}_1$ associated with $X_1$ is higher than the value of $\hat{B}_2$ associated with $X_2$. It shows that, per unit change, the number of cases affects the delivery time more than the distance travelled.

11.3.1 Use of Matrix Notation

When p is greater than 2, it is more convenient to write the normal equations in matrix form. The regression equations in matrix notation can be written as:

$$Y = X B + e \tag{8}$$

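where

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X_{n \times (p+1)} = \begin{pmatrix} X_{01} & X_{11} & \cdots & X_{p1} \\ X_{02} & X_{12} & \cdots & X_{p2} \\ \vdots & \vdots & & \vdots \\ X_{0n} & X_{1n} & \cdots & X_{pn} \end{pmatrix}, \quad
B = \begin{pmatrix} B_0 \\ B_1 \\ \vdots \\ B_p \end{pmatrix} \quad \text{and} \quad
e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}$$

In general, Y is an $n \times 1$ vector of the observed values of the response variable, X is an $n \times (p+1)$ matrix of the values of the regressor variables, B is a $(p+1) \times 1$ vector of regression coefficients and e is an $n \times 1$ vector of random errors. In matrix notation, the (p + 1) normal equations can be written as follows:

$$X'X \hat{B} = X'Y \tag{9a}$$

Equation (9a) represents the least squares normal equations. Written out in full, they are

$$\begin{pmatrix}
n & \sum x_{1i} & \cdots & \sum x_{pi} \\
\sum x_{1i} & \sum x_{1i}^2 & \cdots & \sum x_{1i} x_{pi} \\
\vdots & \vdots & & \vdots \\
\sum x_{pi} & \sum x_{pi} x_{1i} & \cdots & \sum x_{pi}^2
\end{pmatrix}
\begin{pmatrix} \hat{B}_0 \\ \hat{B}_1 \\ \vdots \\ \hat{B}_p \end{pmatrix}
=
\begin{pmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \vdots \\ \sum x_{pi} y_i \end{pmatrix} \tag{9b}$$

To solve the least squares normal equations given in equation (9a), we multiply both sides by the inverse of $X'X$. Thus, the estimates of the regression coefficients are given by

$$\hat{B} = (X'X)^{-1} X'Y \tag{10}$$

On putting the values of the estimates in equation (1), we get the fitted regression model corresponding to the observations of the regressor variables $X_1, X_2, \ldots, X_p$ as

$$\hat{Y} = \hat{B}_0 + \hat{B}_1 X_1 + \hat{B}_2 X_2 + \cdots + \hat{B}_p X_p \tag{11}$$

In matrix form, the fitted values corresponding to the observed values are given by

$$\hat{Y} = X \hat{B} = X (X'X)^{-1} X'Y \tag{12}$$

When the regressors are centralised, we shall use the following notation:

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X_{n \times (p+1)} = \begin{pmatrix} 1 & X_{11} - \bar{X}_1 & \cdots & X_{p1} - \bar{X}_p \\ \vdots & \vdots & & \vdots \\ 1 & X_{1n} - \bar{X}_1 & \cdots & X_{pn} - \bar{X}_p \end{pmatrix}$$

Note that the $X_{0i}$ are all unity and the other variables are centralised (deviations from their means). The (p + 1) normal equations can again be written as $X'X \hat{B} = X'Y$, where $\hat{B}' = (\hat{B}_0', \hat{B}_1, \ldots, \hat{B}_p)$. In case $X'X$ is non-singular, i.e., $X'X$ is of rank (p + 1), the least squares estimates of B, denoted by $\hat{B}$, can be written as $\hat{B} = (X'X)^{-1} X'Y$.

The difference between the observed value $y_i$ and the corresponding estimated value $\hat{y}_i$ is called the i-th residual $r_i$, i.e.,

$$r_i = y_i - \hat{y}_i, \quad i = 1, 2, 3, \ldots, n$$

The residuals may be written in matrix notation as

$$r = Y - \hat{Y} = Y - X \hat{B} \tag{13}$$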
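The matrix formulation in equations (8) to (13) translates almost line by line into array code. The following is a minimal sketch using NumPy; the data are hypothetical and the variable names are illustrative only. It solves the normal equations $X'X\hat{B} = X'Y$ directly rather than forming the inverse explicitly, which is a common numerical choice and not something prescribed by this unit.

```python
import numpy as np

# Hypothetical data (not from this unit): n = 6 observations, p = 2 regressors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 6.8, 7.2, 10.9, 11.3, 14.7])

# Design matrix X of order n x (p + 1); the first column is X0 = 1.
X = np.column_stack([np.ones_like(x1), x1, x2])

XtX = X.T @ X                       # X'X, a (p+1) x (p+1) matrix
XtY = X.T @ y                       # X'Y, a (p+1) vector

B_hat = np.linalg.solve(XtX, XtY)   # solves X'X B = X'Y, cf. equations (9a), (10)
y_hat = X @ B_hat                   # fitted values, cf. equation (12)
r = y - y_hat                       # residuals, cf. equation (13)

print(B_hat)
print(r)
```

A useful check is that the residual vector is orthogonal to every column of X, i.e., $X'r = 0$, which is just the normal equations restated.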

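11.3.2 Properties of Least Squares Estimates

We now describe the statistical properties of least squares estimates. When the $X_j$ ($j = 1, 2, \ldots, p$) are linearly related, $X'X$ is not invertible. In this case we cannot obtain unique estimates of B. We shall not consider this case any further.

It is to be noted that $\hat{B}$ is an unbiased estimate of B because

$$E(\hat{B}) = (X'X)^{-1} X' E(Y) = (X'X)^{-1} X' E(XB + e) = (X'X)^{-1} (X'X) B = B$$

since $E(e) = 0$ and $(X'X)^{-1}(X'X) = I$. This shows that $\hat{B}$ is unbiased.

The variance of Y (which is actually the variance-covariance matrix, as Y is a vector) is given as

$$V(Y) = \sigma^2 I$$

where I is an identity matrix of order n. The variance-covariance matrix of $\hat{B}$ is given by

$$V(\hat{B}) = (X'X)^{-1} X' \, V(Y) \, X (X'X)^{-1} = (X'X)^{-1} X' (\sigma^2 I) X (X'X)^{-1} = \sigma^2 (X'X)^{-1} \tag{14}$$

where

$$X'X = \begin{pmatrix}
\sum X_{0i}^2 & \sum X_{0i} X_{1i} & \cdots & \sum X_{0i} X_{pi} \\
\sum X_{1i} X_{0i} & \sum X_{1i}^2 & \cdots & \sum X_{1i} X_{pi} \\
\vdots & \vdots & & \vdots \\
\sum X_{pi} X_{0i} & \sum X_{pi} X_{1i} & \cdots & \sum X_{pi}^2
\end{pmatrix} = (s_{jk}) \quad \text{and} \quad s_{jk} = \sum_{i=1}^{n} X_{ji} X_{ki}.$$

Here $V(\hat{B}) = \sigma^2 (X'X)^{-1}$ is a $(p+1) \times (p+1)$ matrix; its diagonal elements give the variances of the coefficients and its off-diagonal elements give the covariances. If we denote the (j, k)-th element of $(X'X)^{-1}$ by $s^{jk}$, i.e., if we use the notation

$$V(\hat{B}) = \sigma^2 (X'X)^{-1} = (\sigma^2 s^{jk}), \quad j, k = 0, 1, \ldots, p$$

we can write

$$V(\hat{B}_j) = \sigma^2 s^{jj} \quad \text{and} \quad \mathrm{Cov}(\hat{B}_j, \hat{B}_k) = \sigma^2 s^{jk} \tag{15}$$

The standard error of $\hat{B}_j$ is given by

$$\mathrm{S.E.}(\hat{B}_j) = \sigma \sqrt{s^{jj}} \tag{16}$$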

The residual sum of squares $SS_{Res}$ is obtained by substituting the least squares estimates of $B_0, B_1, \ldots, B_p$ in equation (3):

$$SS_{Res} = \sum_{i=1}^{n} \left( Y_i - \hat{B}_0 X_{0i} - \hat{B}_1 X_{1i} - \cdots - \hat{B}_p X_{pi} \right)^2$$

This is the sum of squares not accounted for by the regression model. In matrix notation, this can be written as

$$SS_{Res} = Y'Y - Y'X\hat{B} = \sum Y_i^2 - \hat{B}_0' (Y'X_0) - \hat{B}_1 (Y'X_1) - \hat{B}_2 (Y'X_2) - \cdots - \hat{B}_p (Y'X_p) \tag{17}$$

Note that in the second form $X_1, X_2, \ldots, X_p$ are deviations from their respective means. As we have fitted (p + 1) parameters, the degrees of freedom of the residual sum of squares are (n − p − 1). An unbiased estimate of $\sigma^2$ is obtained by dividing the residual sum of squares $SS_{Res}$ by its degrees of freedom (n − p − 1). Thus

$$\hat{\sigma}^2 = \frac{SS_{Res}}{n - p - 1} \tag{18}$$

If we are interested in predicting the mean value of Y for a given set of independent variables $X_1, \ldots, X_p$, then we use the fitted model. The predicted mean value of Y for given $X_1, \ldots, X_p$ is given by

$$\hat{Y} = \hat{B}_0 + \hat{B}_1 X_1 + \cdots + \hat{B}_p X_p$$

Let us explain the matrix method with the help of an example.

Example 2: Using the data of Example 1, find the estimates of the regression coefficients and $SS_{Res}$ by using the matrix method. Also predict the expected time Y at $X_1$ = 7, $X_2$ = .

Solution: Using the matrix notation, we have from the data:

Y = [,,, 5, 5,,, 5, 3, 5,, ]

and

35 X' X 95 35, X' Y

so that $\hat{B} = (\hat{B}_0, \hat{B}_1, \hat{B}_2)' = (X'X)^{-1} X'Y$.

Hence the fitted equation is

$\hat{Y}$ = .8765 + .3 $X_1$ + .356 $X_2$

Now, we calculate the value of the residual sum of squares to obtain an estimate of $\sigma^2$ as follows:

SS YY YXBˆ = 38 (.8765) 85 (.3) 68 (.356) = = 97.5

Therefore, on putting the value of $SS_{Res}$ in equation (18), we get

ˆ 97.5/( 3).87

Using the above results and putting the values $X_1$ = 7 and $X_2$ = in the fitted equation for multiple regression, we get

Ŷ X.356 X Ŷ

As far as the interpretation of the coefficients is concerned, there is an increase of .3 minutes in time for a one unit increase in $X_1$, with $X_2$ held fixed. Similarly, for a one unit increase in $X_2$ there is an increase of .356 minutes in time, with $X_1$ held fixed.

You may like to pause here and solve the following exercises to check your understanding.

E1) In a study of firms, the dependent variable was the total delivery time (Y) and the independent variables were the distance covered ($X_1$) and the packaging time ($X_2$). The delivery time data collected by the statistical analyst is given below:

Time (Y) | Distance ($X_1$) | Packaging Time ($X_2$)

Totals: $\sum Y_i$ = 66, $\sum X_{1i}$ = 747, $\sum X_{2i}$ = 6

Estimate the parameters $B_0$, $B_1$ and $B_2$ by solving the normal equations and find the estimated multiple linear regression equation.

E2) Use the matrix method to estimate the parameters from the data given in E1).
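The quantities introduced in Sec. 11.3.2, namely the residual sum of squares of equation (17), the estimate $\hat{\sigma}^2$ of equation (18), the estimated variance-covariance matrix of equation (14) and the standard errors of equation (16) with $\sigma$ replaced by $\hat{\sigma}$, can be computed together. The sketch below uses NumPy with hypothetical data; the function name `ols_summary` and the prediction point are assumptions made for illustration only.

```python
import numpy as np

def ols_summary(X, y):
    """Least squares estimates, SS_Res, sigma^2-hat and standard errors.

    X is the n x (p+1) design matrix whose first column is all ones.
    """
    n, k = X.shape                                # k = p + 1 fitted parameters
    XtX_inv = np.linalg.inv(X.T @ X)
    B_hat = XtX_inv @ (X.T @ y)                   # equation (10)
    ss_res = float(y @ y - B_hat @ (X.T @ y))     # Y'Y - B'X'Y, equation (17)
    sigma2_hat = ss_res / (n - k)                 # equation (18), d.f. = n - p - 1
    cov_B = sigma2_hat * XtX_inv                  # estimated V(B-hat), equation (14)
    se = np.sqrt(np.diag(cov_B))                  # standard errors, equation (16)
    return B_hat, ss_res, sigma2_hat, se

# Hypothetical data (not from this unit).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 6.8, 7.2, 10.9, 11.3, 14.7])
X = np.column_stack([np.ones_like(x1), x1, x2])

B_hat, ss_res, sigma2_hat, se = ols_summary(X, y)

# Predicted mean response at a hypothetical point X1 = 3.5, X2 = 2.5.
x_new = np.array([1.0, 3.5, 2.5])
y_pred = float(x_new @ B_hat)
print(B_hat, ss_res, sigma2_hat, se, y_pred)
```

The shortcut formula $Y'Y - \hat{B}'X'Y$ agrees with the direct sum of squared residuals; the direct form is often preferred numerically when the two terms are large and nearly equal.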

11.4 TEST OF SIGNIFICANCE IN MULTIPLE REGRESSION

So far you have learnt how to estimate the parameters and fit the multiple regression model. You may now like to test the adequacy of the fitted model and examine whether or not the independent variables contribute significantly in explaining the variability in Y. For this purpose, we use a test of significance based on a ratio of variances. To determine whether there is a linear relationship between the response variable Y and any of the independent variables $X_1, X_2, \ldots, X_p$, we use the test of significance of regression. The test of significance of regression is a test to determine the linear relationship between the response variable and the regressor variables and is often used to examine the adequacy of the model.

In order to test whether the contribution of the independent variables $X_1, \ldots, X_p$ is significant or not, we test whether $B_1, B_2, \ldots, B_p$ are all zero in the model or at least one of them is not zero. This hypothesis can be written as:

$H_0$: $B_1 = B_2 = \cdots = B_p = 0$
$H_1$: At least one of the regression coefficients is not zero

It can be tested by considering the following F-ratio:

$$F = \frac{SS_{Reg} / p}{SS_{Res} / (n - p - 1)} \tag{19}$$

In this test, the total sum of squares $SS_T$ is partitioned into a sum of squares due to the contribution of the regressor variables ($SS_{Reg}$) and a residual sum of squares ($SS_{Res}$). From equation (17), the residual sum of squares is:

$$SS_{Res} = \sum Y_i^2 - \hat{B}_0' (Y'X_0) - \hat{B}_1 (Y'X_1) - \cdots - \hat{B}_p (Y'X_p)$$

or

$$SS_{Res} = Y'Y - Y'X\hat{B} \tag{20}$$

If $B_1, B_2, \ldots, B_p$ are all zero, i.e., the independent variables do not contribute to the variability in Y, then the residual sum of squares reduces to the total sum of squares, denoted by $SS_T$, which is given as:

$$SS_T = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = Y'Y - \frac{\left( \sum Y_i \right)^2}{n} = Y'Y - n\bar{Y}^2 \tag{21}$$

This is the total variability present in Y around the mean $\bar{Y}$. We can rewrite equation (20) as

$$SS_{Res} = \left[ \sum Y_i^2 - \frac{\left( \sum Y_i \right)^2}{n} \right] - \left[ \hat{B}' X'Y - \frac{\left( \sum Y_i \right)^2}{n} \right]$$

that is,

$$SS_{Res} = SS_T - SS_{Reg}$$

Hence, the difference $SS_T - SS_{Res}$ gives the contribution of the independent variables $X_1, X_2, \ldots, X_p$ in explaining the variability in Y, i.e.,

$$SS_T - SS_{Res} = \hat{B}_0' (Y'X_0) + \hat{B}_1 (Y'X_1) + \cdots + \hat{B}_p (Y'X_p) - n\bar{Y}^2$$

$$SS_{Reg} = Y'X\hat{B} - \frac{\left( \sum Y_i \right)^2}{n} \quad \text{or} \quad SS_{Reg} = \sum_{j=0}^{p} \hat{B}_j \, Y'X_j - n\bar{Y}^2 \tag{22}$$

We now summarise these results in the following ANOVA table:

ANOVA TABLE

Sources of Variation | Degrees of Freedom (d.f.) | Sum of Squares (S.S.) | Mean Sum of Squares | Variance Ratio
Independent Variables ($X_1, X_2, \ldots, X_p$) | $p$ | $SS_{Reg} = \sum_{j=0}^{p} \hat{B}_j \, Y'X_j - n\bar{Y}^2$ | $SS_{Reg}/p$ | $F = \dfrac{SS_{Reg}/p}{SS_{Res}/(n-p-1)}$
Residuals ($SS_{Res}$) | $n - p - 1$ | $SS_{Res} = Y'Y - \sum_{j=0}^{p} \hat{B}_j \, Y'X_j$ | $SS_{Res}/(n-p-1)$ |
Total | $n - 1$ | $Y'Y - n\bar{Y}^2$ | |

Under the null hypothesis, i.e., when $B_1 = B_2 = \cdots = B_p = 0$, F is distributed as Fisher's F-distribution with p and (n − p − 1) degrees of freedom, i.e.,

$$F \sim F_{p,\,(n-p-1)} \tag{23}$$

If the calculated F is less than the tabulated $F_{p,\,(n-p-1)}$ at the α level of significance, then we conclude that the contribution of $X_1, X_2, \ldots, X_p$ to the variability in Y is not significant, so these variables do not help appreciably in prediction.

It may be of further interest to examine whether any one coefficient (say $B_j$) corresponding to the independent variable $X_j$ is different from zero, after accounting for the other variables $X_k$ (all $k \ne j$). This can be tested by considering the statistic t:

$$t = \frac{\hat{B}_j}{\mathrm{S.E.}(\hat{B}_j)} \tag{24}$$

where $\mathrm{S.E.}(\hat{B}_j)$ uses the estimated value $\hat{\sigma}^2$ given in equation (18). Under the null hypothesis, i.e., $B_j = 0$, the statistic t follows Student's t-distribution with (n − p − 1) d.f. Thus, if

$$|t| \le t_{\alpha/2,\; n-p-1} \tag{25}$$

we do not reject $H_0$. Otherwise, we reject it. If $B_j$ is significantly different from zero, $X_j$ contributes significantly to the variability in Y after taking into account the contribution of the other variables. If $B_j$ is not significantly different from zero, its contribution is not significant after accounting for the other variables in the model.

Example 4: Using the data of Example 1 and the results of Example 2, construct the ANOVA table, apply a relevant test of hypothesis and interpret the results.
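As a computational aside, the ANOVA decomposition and the F and t statistics of equations (19) to (25) can be sketched as follows. The data are hypothetical and the function name `anova` is an assumption for illustration; the critical values still have to be read from F and t tables (or a statistics library) for the chosen α and degrees of freedom, which this sketch deliberately leaves to the reader.

```python
import numpy as np

def anova(X, y):
    """Sums of squares, F ratio and t statistics for a fitted linear model.

    X is the n x (p+1) design matrix whose first column is all ones.
    """
    n, k = X.shape
    p = k - 1
    XtX_inv = np.linalg.inv(X.T @ X)
    B_hat = XtX_inv @ (X.T @ y)

    ss_t = float(y @ y - n * y.mean() ** 2)       # total SS, equation (21)
    ss_res = float(y @ y - B_hat @ (X.T @ y))     # residual SS, equation (20)
    ss_reg = ss_t - ss_res                        # regression SS, equation (22)

    ms_res = ss_res / (n - p - 1)                 # sigma^2-hat, equation (18)
    F = (ss_reg / p) / ms_res                     # equation (19)

    se = np.sqrt(ms_res * np.diag(XtX_inv))       # standard errors of B-hat
    t = B_hat / se                                # equation (24), one per coefficient
    return {"ss_t": ss_t, "ss_reg": ss_reg, "ss_res": ss_res,
            "df": (p, n - p - 1), "F": F, "t": t, "B_hat": B_hat}

# Hypothetical data (not from this unit).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 6.8, 7.2, 10.9, 11.3, 14.7])
X = np.column_stack([np.ones_like(x1), x1, x2])

res = anova(X, y)
print(res["df"], res["F"], res["t"])
```

The calculated F is compared with the tabulated $F_{p,\,n-p-1}$ and each |t| with the tabulated $t_{\alpha/2,\,n-p-1}$, exactly as described in the text above.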

Solution: As per the data given in Example 1 and the results of Example 2, we have

SS = ad p ˆB Y'X =

Using these values, we construct the ANOVA table as follows:

ANOVA TABLE

Sources of Variation | Degrees of Freedom (d.f.) | Sum of Squares (S.S.) | Mean Sum of Squares | Variance Ratio
Independent Variables ($X_1, X_2$) | 2 | $SS_{Reg} = \sum_j \hat{B}_j \, Y'X_j - n\bar{Y}^2$ = | $SS_{Reg}/p$ = | $F = \dfrac{SS_{Reg}/p}{SS_{Res}/(n-p-1)}$ = 17.10
Residuals ($SS_{Res}$) | 12 − 2 − 1 = 9 | $SS_{Res} = Y'Y - \sum_j \hat{B}_j \, Y'X_j$ = | $SS_{Res}/(n-p-1)$ = .87 |
Total | 11 | $Y'Y - n\bar{Y}^2$ = | |

We have obtained the variance ratio F = 17.10, whereas the tabulated value of $F_{2,9}$ at α = 0.05 is 4.26. Hence, we reject $H_0$ and conclude that $X_1$ and $X_2$ contribute significantly to the variability in Y.

It may be of further interest to examine whether the coefficient $B_j$ corresponding to the independent variable $X_j$ is different from zero, after accounting for the other variables $X_k$ (all $k \ne j$). This can be tested by considering the statistic t:

$$t = \frac{\hat{B}_j}{\mathrm{S.E.}(\hat{B}_j)}$$

From the result of Example 2, we also have $\hat{B}_1$ = .3 and $\hat{B}_2$ = .356.

The variance-covariance matrix is $\hat{V}(\hat{B}) = \hat{\sigma}^2 (X'X)^{-1}$. Using equation (15), we obtain

V(B ˆ ) 7.7, V(B ˆ ˆ ).4 ad V(B ).4

and therefore,

ˆ S.E. B =

ˆ S.E. B ˆ S.E. B =

Therefore, the statistic t for each coefficient is given as:

ˆB.8765 t.6758 S.E.(B ˆ.7769 )
ˆB.3 t 4.64 S.E.(B ˆ.399 )
ˆB.356 t.7444 S.E.(B ˆ.494 )

The tabulated value of the t-statistic for α = 0.05 is $t_{0.025,\,9}$ = 2.262. Hence, both variables contribute significantly to the variability in Y.

You may now like to solve the following exercise.

E3) Make the ANOVA table, calculate the standard errors of the estimates and test their significance using the data in E1. Interpret the results.

11.5 COEFFICIENT OF DETERMINATION ($R^2$) AND ADJUSTED $R^2$

We define the coefficient of determination, $R^2$, in the same way as for simple regression. It gives a measure of the adequacy of the model fit. We define $R^2$ as follows:

$R^2$ = Variability accounted for by the independent variables / Total variability around the mean

$$R^2 = \frac{\sum_{j=0}^{p} \hat{B}_j \, Y'X_j - n\bar{Y}^2}{Y'Y - n\bar{Y}^2} \tag{26}$$

Its value always lies between 0 and 1. When the fit is good, $R^2 \approx 1$. Otherwise, $R^2 \approx 0$. The value of $R^2$ never decreases as p increases: the increase may be negligible, but $R^2$ does not go down when a regressor is added. When we compare two models with different values of p, the model with larger p is preferable only if the $R^2$ corresponding to it is significantly larger than the $R^2$ of the model with smaller p. A model with smaller p and large $R^2$ is preferable as it is a simpler model. Hence, you should choose a model with small p if its $R^2$ is not much smaller than the $R^2$ of a model with larger p. For this, we define an adjusted $R^2$, viz., $R^2_{Adj}$, which penalises $R^2$ when p increases but $R^2$ does not increase significantly. We know that

$$R^2 = \frac{SS_{Reg}}{SS_T} = \frac{SS_T - SS_{Res}}{SS_T} = 1 - \frac{SS_{Res}}{SS_T} \tag{27}$$

Then we define $R^2_{Adj}$ as

$$R^2_{Adj} = 1 - \frac{SS_{Res}/(n-p-1)}{SS_T/(n-1)} = 1 - \frac{(n-1)(1-R^2)}{(n-p-1)} \tag{28}$$

Here, we have divided the numerator and denominator by their degrees of freedom. Since the residual degrees of freedom (n − p − 1) shrink as p grows, $SS_{Res}/(n-p-1)$ may increase with an increase in p when $SS_{Res}$ does not decrease appreciably, i.e., when there is no appreciable increase in $R^2$. Hence,

$$R^2_{Adj} = 1 - \frac{(n-1)(1-R^2)}{(n-p-1)} \tag{29}$$

may decrease when a regressor is added. Therefore, we should stop including terms in the model if $R^2_{Adj}$ starts decreasing. We prefer a model with larger $R^2_{Adj}$ and smaller p to a model with smaller $R^2_{Adj}$ but larger p.

Example 5: Using the data of Example 1 and the results of Examples 2 and 3, calculate $R^2$ and $R^2_{Adj}$, and interpret the results.

Solution: Using the data of Example 1 and the results of Examples 2 and 3, and on putting the values in equation (27), we get

$$R^2 = \frac{SS_{Reg}}{SS_T} = 0.7917$$

Therefore, the adjusted $R^2$ is obtained as follows:

$$R^2_{Adj} = 1 - \frac{(12-1)(1-0.7917)}{(12-2-1)} = 0.7454$$

From the coefficient of determination, $R^2$, we see that 79% of the variability in Y is explained by $X_1$ and $X_2$. This is quite a good fit. The adjusted $R^2$ is 0.7454, which is quite large. Hence we conclude that both $X_1$ and $X_2$ contribute adequately to the model fit.

You may now like to calculate $R^2$ and adjusted $R^2$ yourself. Try the following exercise.

E4) Calculate $R^2$ and adjusted $R^2$ and comment on the goodness of fit of the model, for the data given in E1.

11.6 REGRESSION WITH DUMMY VARIABLES

In previous sections, we have dealt with multiple linear regression when the independent/regressor variables are quantitative. Quantitative variables such as height, distance, temperature, time, income, pressure, etc. have a well

defined scale of measurement. However, sometimes the independent variables include qualitative variables such as sex (male/female), region (north, south, east, west, etc.) or religion (Hindu, Muslim, Christian, etc.). Such variables, called categorical variables, have no natural numerical scale, and hence no meaningful quantitative value can be assigned to their categories directly. We define dummy variables to account for the effect that the qualitative variables may have on the response variable. Dummy variables are also known as indicator variables.

Suppose k is the number of levels a categorical variable takes. Then we define (k − 1) dummy variables. For example, if we have two categories, male and female, in the data, then k = 2 and we define one dummy variable.

Suppose that a statistical analyst is analysing a vending machine's efficiency in the distribution of a product. She/he is interested in relating the time required to service the consumer to the distance travelled by the product in the vending machine, for machines of two types, A and B. The second regressor variable, machine type, is qualitative and has two levels: Type A and Type B. Therefore, we define a dummy variable $X_2$, which takes on the values 0 and 1 to identify the types of machines as follows:

$$X_2 = \begin{cases} 0, & \text{if distribution is done by machine A} \\ 1, & \text{if distribution is done by machine B} \end{cases}$$

The variable $X_2$ is called an indicator variable because it is used to indicate which machine, A or B, is present. For such situations, we have a multiple linear regression model given by

$$Y = B_0 + B_1 X_1 + B_2 X_2 + e \tag{30}$$

To interpret the regression coefficients in this model, we first consider machine type A, for which $X_2$ takes the value 0. Then the regression model is given by:

$$Y = B_0 + B_1 X_1 + B_2 (0) + e, \quad \text{i.e.,} \quad Y = B_0 + B_1 X_1 + e \tag{31}$$

The relationship between the response variable Y and the regressor variable $X_1$, i.e., the distance travelled by the product in the machine, is a straight line with intercept $B_0$ and slope $B_1$. For a machine of type B, we have $X_2 = 1$.

Then the regression model becomes

$$Y = B_0 + B_1 X_1 + B_2 (1) + e, \quad \text{i.e.,} \quad Y = (B_0 + B_2) + B_1 X_1 + e \tag{32}$$

which shows that the relationship between Y and $X_1$ is also a straight line with slope $B_1$ but intercept $(B_0 + B_2)$.

Note that these models are linear with the same slope $B_1$ but different intercepts. Hence, these two models describe two parallel regression lines, i.e., two lines with a common slope and different intercepts. The vertical distance between these two lines is the difference in the intercepts, i.e., $B_2$. The two parallel regression lines formed by the models given in equations (31) and (32) are shown in Fig. 11.1.
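The coding of a categorical variable with k levels into (k − 1) dummy variables, and the two parallel lines of equations (31) and (32), can be sketched in a few lines. The function names `dummy_code` and `line_for` and the coefficient values are hypothetical, introduced only for this illustration; the first level in the list is treated as the reference category (all dummies equal to 0), matching the convention $X_2 = 0$ for machine A used above.

```python
def dummy_code(values, levels):
    """Code each value into (k - 1) 0/1 dummies; levels[0] is the reference level."""
    others = levels[1:]
    return [[1 if v == lev else 0 for lev in others] for v in values]


def line_for(machine, B0, B1, B2):
    """Intercept and slope of Y = B0 + B1*X1 + B2*X2 for machine type 'A' or 'B'."""
    x2 = dummy_code([machine], ["A", "B"])[0][0]   # 0 for A, 1 for B
    return B0 + B2 * x2, B1


# Two levels -> one dummy variable.
print(dummy_code(["A", "B", "A"], ["A", "B"]))            # [[0], [1], [0]]

# Three levels -> two dummy variables.
print(dummy_code(["A", "B", "C", "A"], ["A", "B", "C"]))  # [[0, 0], [1, 0], [0, 1], [0, 0]]

# Hypothetical coefficients: both lines share slope B1; intercepts differ by B2.
print(line_for("A", 5.0, 2.0, 1.5))   # (5.0, 2.0)
print(line_for("B", 5.0, 2.0, 1.5))   # (6.5, 2.0)
```

Using only (k − 1) dummies keeps the design matrix of full rank: a k-th dummy would be an exact linear combination of the others and the column of ones, so $X'X$ would not be invertible.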

Fig. 11.1

For three machine types A, B and C, two dummy variables $X_2$ and $X_3$ are used. The model becomes

$$Y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + e \tag{33}$$

The levels of the dummy variables would be:

$X_2 = 0$, $X_3 = 0$ for Machine Type A
$X_2 = 1$, $X_3 = 0$ for Machine Type B
$X_2 = 0$, $X_3 = 1$ for Machine Type C

In general, a categorical variable with k categories is represented by (k − 1) dummy variables. Let us try to understand regression analysis using dummy variables with the help of an example.

Example 6: A statistical analyst is analysing the performance of washing machines in the distribution system. He/she is interested in predicting the amount of time required by the driver to service washing machines of two types: i) Type A and ii) Type B. The data on the required time collected by the statistical analyst is given below:

Time (Y) | Distance ($X_1$) | Machine Type ($X_2$)

Check whether there is a linear relationship between Y (time) and the two independent variables $X_1$ (distance) and $X_2$ (machine type). Calculate the values of the coefficients and fit the regression equation.

Solution: Since two types of washing machines, A and B, have been used, k = 2. Here we have to define one dummy variable $X_2$, which takes two values:

$X_2$ = 0 if the observation is from machine A
$X_2$ = 1 if the observation is from machine B

We form the following table from the given data to fit the regression equation:

Time (Y) | Distance ($X_1$) | Machine Type ($X_2$) | $Y^2$ | $X_1^2$ | $X_2^2$ | $X_1 Y$ | $X_2 Y$ | $X_1 X_2$

Totals: $\sum Y_i$ = , $\sum X_{1i}$ = 35, $\sum X_{2i}$ = 5, $\sum Y_i^2$ = 38, $\sum X_{1i}^2$ = 5, $\sum X_{2i}^2$ = 5, $\sum X_{1i} Y_i$ = 68, $\sum X_{2i} Y_i$ = 8, $\sum X_{1i} X_{2i}$ =

The normal equations (5) for p = 2 and $X_{0i} = 1$ are:

$$\begin{aligned}
n \hat{B}_0 + \hat{B}_1 \sum X_{1i} + \hat{B}_2 \sum X_{2i} &= \sum Y_i \\
\hat{B}_0 \sum X_{1i} + \hat{B}_1 \sum X_{1i}^2 + \hat{B}_2 \sum X_{1i} X_{2i} &= \sum Y_i X_{1i} \\
\hat{B}_0 \sum X_{2i} + \hat{B}_1 \sum X_{1i} X_{2i} + \hat{B}_2 \sum X_{2i}^2 &= \sum Y_i X_{2i}
\end{aligned}$$

On putting the values of the sums calculated in the above table, we get

Bˆ 35 Bˆ 5Bˆ (i)
35 Bˆ 5 Bˆ Bˆ 68 (ii)
5 Bˆ Bˆ 5Bˆ 8 (iii)

From equation (iii), we have

Bˆ 6 44 Bˆ Bˆ (iv)

On putting the value of $\hat{B}_0$ in equations (i) and (ii) and simplifying, we get

78Bˆ 7 Bˆ 8 (v)
3 Bˆ 3 Bˆ (vi)

On solving equations (v) and (vi), we get $\hat{B}_1$ = .3498 and $\hat{B}_2$ = −.38, and then, from equation (iv), $\hat{B}_0$ = .646. Hence, the fitted regression equation is

$\hat{Y}$ = .646 + .3498 $X_1$ − .38 $X_2$ (vii)

We conclude that there is a linear relationship between Y (time) and the two independent variables $X_1$ (distance) and $X_2$ (type of machine). Since the regression coefficient of the variable $X_2$ is negative, the fitted time is lower for machine B ($X_2 = 1$) than for machine A at the same distance. The numerical value of the regression coefficient associated with $X_2$ is higher than that of the other regressor variable. It shows that, per unit change, the distance travelled (in m) affects the delivery time less than the type of machine.

To determine the regression line for each type of machine, we first consider machine A, for which $X_2$ takes the value 0. We put the values of the regression coefficients and $X_2 = 0$ in the fitted equation (vii). Then the regression model becomes

$\hat{Y}$ = .646 + .3498 $X_1$ (viii)

For machine B, we put the values of the regression coefficients and $X_2 = 1$. Then the regression model becomes

$\hat{Y}$ = .66 + .3498 $X_1$ (ix)

Note that, as discussed in Sec. 11.6, these estimated regression lines have the same slope, i.e., .3498, but different intercepts, i.e., .646 and .66.

You may now like to solve the following problem to check your understanding:

E5) Using the data given in the following table, find the regression coefficients and obtain the estimated regression equations for the models given in equations (30), (31) and (32):

Time (hour) Y | Distance (feet) $X_1$ | Machine Type $X_2$
8 | 6 | A
4 | 95 | A
7 | 7 | A
4 | 84 | A
3 | 98 | A
4 | 53 | B
3 | 68 | B
  | 54 | B
  | 89 | B
9 | 73 | B

Check whether there is a linear relationship between Y (time) and the two independent variables $X_1$ (distance) and $X_2$ (machine type). Calculate the values of the coefficients and fit the regression equation.

We now summarise the concepts that we have discussed in this unit.

11.7 SUMMARY

1. The basic concept of multiple linear regression is the same as that of simple regression. However, instead of one independent variable, there are several independent variables, say, $X_1, X_2, X_3, \ldots, X_p$.

2. A multiple regression model is given by $Y = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_p X_p + e$, where Y is the dependent variable and $X_1, X_2, \ldots, X_p$ are p independent variables. This is called the multiple linear regression model with p independent/regressor variables. The term linear is used because the dependent/response variable Y is a linear function of the unknown parameters $B_0, B_1, B_2, \ldots, B_p$.

3. The simple regression model considered in Unit 9 becomes a particular case of this model with $X_0 = 1$, $B_0 = a$, $B_1 = b$ and $B_i = 0$ ($i \ge 2$). The interpretation of the coefficients $B_j$ ($j = 1, 2, \ldots, p$) is that $B_j$ represents the amount of change in Y for a unit change in $X_j$, keeping the other independent variables $X_k$ ($k \ne j$) fixed. These coefficients are known as partial regression coefficients, as the effect of one independent variable on the dependent variable is studied while the other variables are held fixed or constant. We use the term multiple linear regression for this model because several variables are included in the regression and the parameters $B_0, B_1, \ldots, B_p$ appear in a linear form.

4. We estimate the parameters of a multiple linear regression equation using the method of least squares. In this method, we minimise the sum of the squares of the differences between the observed values $Y_i$ and their expected values, i.e., the sum of squares of the error terms. When p is greater than 2, it is more convenient to write the normal equations in matrix form. The regression equations in matrix notation can be written as $Y = XB + e$, where Y is an $n \times 1$ vector of the observed values of the response variable, X is an $n \times (p+1)$ matrix of the values of the regressor variables, B is a $(p+1) \times 1$ vector of regression coefficients and e is an $n \times 1$ vector of random errors. In matrix notation, the (p + 1) normal equations can be written as $X'X\hat{B} = X'Y$.

5. The variance-covariance matrix of $\hat{B}$ is given by $V(\hat{B}) = \sigma^2 (X'X)^{-1}$, which is a $(p+1) \times (p+1)$ matrix; its diagonal elements give the variances of the coefficients and its off-diagonal elements give the covariances. If we use the notation

$$V(\hat{B}) = \sigma^2 (X'X)^{-1} = (\sigma^2 s^{jk}), \quad j, k = 0, 1, \ldots, p$$

we can write $V(\hat{B}_j) = \sigma^2 s^{jj}$ and $\mathrm{Cov}(\hat{B}_j, \hat{B}_k) = \sigma^2 s^{jk}$. The standard error of $\hat{B}_j$ is given by $\mathrm{S.E.}(\hat{B}_j) = \sigma \sqrt{s^{jj}}$.

6. To determine whether there is a linear relationship between the response variable Y and any of the independent variables $X_1, X_2, \ldots, X_p$, we use the test of significance of regression. It is often used to examine the adequacy of the model.

7. The coefficient of determination, $R^2$, and adjusted $R^2$ are measures of goodness of fit of the multiple regression model. The value of $R^2$ never decreases as p increases, although the increase may be negligible. When we compare two models with different values of p, the model with larger p is preferable only if the $R^2$ corresponding to it is significantly larger than the $R^2$ of the model with smaller p. A model with smaller p and large $R^2$ is preferable as it is a simpler model. Hence, one should choose a model with small p if its $R^2$ is not much smaller than the $R^2$ of a model with larger p.

8. We define dummy variables to account for the effect that qualitative variables may have on the response variable. Dummy variables are also known as indicator variables. If k represents the number of levels a categorical variable takes, then we define (k − 1) dummy variables. For example, if we have two categories, male and female, in the data, then k = 2 and we define one dummy variable.

11.8 SOLUTIONS/ANSWERS

E1) We do the following calculations for the given data:

Time (Y) | Distance ($X_1$) | Packaging Time ($X_2$) | $Y^2$ | $X_1^2$ | $X_2^2$ | $X_1 Y$ | $X_2 Y$ | $X_1 X_2$

Totals: $\sum Y_i$ = 66, $\sum X_{1i}$ = 747, $\sum X_{2i}$ = 6, $\sum Y_i^2$ = 98, $\sum X_{1i}^2$ = 5889, $\sum X_{2i}^2$ = 75, $\sum X_{1i} Y_i$ = 9, $\sum X_{2i} Y_i$ = 4535, $\sum X_{1i} X_{2i}$ = 86

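From the above table, putting the values of $\sum Y_i$, $\sum X_{1i}$, $\sum X_{2i}$, $\sum X_{1i}^2$, $\sum X_{2i}^2$, $\sum X_{1i}Y_i$, $\sum X_{2i}Y_i$ and $\sum X_{1i}X_{2i}$ in the normal equations, we get

$10\hat{B}_0 + 747\hat{B}_1 + 6\hat{B}_2 = 66$ (i)

$747\hat{B}_0 + 5889\hat{B}_1 + 86\hat{B}_2 = 9$ (ii)

$6\hat{B}_0 + 86\hat{B}_1 + 75\hat{B}_2 = 4535$ (iii)

Solving these equations, we get the estimate $\hat{B}_0$ together with $\hat{B}_1 = -.79$ and $\hat{B}_2 = .6$. The fitted regression equation is:

$\hat{Y} = \hat{B}_0 + \hat{B}_1 X_1 + \hat{B}_2 X_2$, with $\hat{B}_1 = -.79$ and $\hat{B}_2 = .6$

E2) Using the matrix notation, we form from the data the vector Y of the 10 observed times, the matrix $X'X$ and the vector $X'Y$. Their elements are the sums computed in E1:

$$X'X = \begin{bmatrix} 10 & 747 & 6 \\ 747 & 5889 & 86 \\ 6 & 86 & 75 \end{bmatrix}, \qquad X'Y = \begin{bmatrix} 66 \\ 9 \\ 4535 \end{bmatrix}$$

We then compute $(X'X)^{-1}$ and obtain

$$\hat{B} = (X'X)^{-1} X'Y$$

which gives the same estimates as in E1, i.e., $\hat{B}_1 = -.79$ and $\hat{B}_2 = .6$. Hence, the fitted equation is the same as that obtained in E1.

We now calculate the value of the residual sum of squares to obtain an estimate of $\sigma^2$ as follows:

$$SS_{Res} = Y'Y - \hat{B}'X'Y = 4.45$$

Therefore, on putting the value of $SS_{Res}$ in equation (8), we get

$$\hat{\sigma}^2 = \frac{SS_{Res}}{n - p - 1} = \frac{4.45}{10 - 3} = 6.6$$

E3) Using the data of E1 and the results of E2, we have the estimate $\hat{B}_0$ together with $\hat{B}_1 = -.79$ and $\hat{B}_2 = .6$.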

As per the data and the results of the earlier examples, we have $SS_{Res} = 4.45$ and p = 2. Using these values, we construct the ANOVA table as follows:

ANOVA TABLE

Source of Variation | Degrees of Freedom (d.f.) | Sum of Squares (S.S.) | Mean Sum of Squares | Variance Ratio
Independent Variables ($X_1$, $X_2$) | p = 2 | $SS_{Reg} = \sum_{j=0}^{p} \hat{B}_j Y'X_j - n\bar{Y}^2 = 9.58$ | $SS_{Reg}/p$ | $F = \dfrac{SS_{Reg}/p}{SS_{Res}/(n - p - 1)} = 9.74$
Residuals | n − p − 1 = 7 | $SS_{Res} = Y'Y - \sum_{j=0}^{p} \hat{B}_j Y'X_j = 4.45$ | $SS_{Res}/(n - p - 1) = 6.6$ |
Total | n − 1 = 9 | $Y'Y - n\bar{Y}^2 = 5.4$ | |

The calculated value of the variance ratio, F = 9.74, is greater than the tabulated value of F with (2, 7) degrees of freedom at α = 0.05. Hence, we reject $H_0$ and conclude that $X_1$ and $X_2$ contribute significantly in explaining the variability.

It may be of further interest to examine whether the coefficient $B_j$, corresponding to the independent variable $X_j$, is different from zero after accounting for the other variables $X_k$ (all k ≠ j). This can be tested by considering the statistic t:

$$t = \frac{\hat{B}_j}{\mathrm{S.E.}(\hat{B}_j)}$$

From the earlier result, we have $\hat{B}_1 = -.79$ and $\hat{B}_2 = .6$. The variance-covariance matrix is

$$V(\hat{B}) = \hat{\sigma}^2 (X'X)^{-1}$$

Using equation (5), we obtain the variance $V(\hat{B}_0)$ together with $V(\hat{B}_1) = .4$ and $V(\hat{B}_2) = .3$, and therefore

$\mathrm{S.E.}(\hat{B}_0) = 7.56$, $\mathrm{S.E.}(\hat{B}_1) = .64$ and $\mathrm{S.E.}(\hat{B}_2) = .489$

Therefore, the values of the statistic t are:

For $\hat{B}_0$: $|t| = \hat{B}_0 / \mathrm{S.E.}(\hat{B}_0) = 3.65$
For $\hat{B}_1$: $|t| = \hat{B}_1 / \mathrm{S.E.}(\hat{B}_1) = .74$
For $\hat{B}_2$: $|t| = \hat{B}_2 / \mathrm{S.E.}(\hat{B}_2) = .94$

The tabulated value of the t-statistic for α = 0.05 is $t_{0.025,\,7} = 2.37$. Comparing each $|t|$ with this value, one of the variables $X_1$ and $X_2$ contributes significantly in explaining the variability in Y but the other does not.

As far as the interpretation of the coefficients is concerned, there is an increase of .6 seconds in time for one unit change in cases ($X_2$). Similarly, for one unit increase in $X_1$, there is a .79 seconds decrease in time.

E4) Using the data of E1 and the results of E2 and E3, we get

$$R^2 = \frac{\text{Sum of Squares due to } X_1, X_2}{\text{Total Sum of Squares}} = \frac{9.58}{5.4} = .79$$

and

$$R^2_{Adj} = 1 - \frac{(n - 1)(1 - R^2)}{n - p - 1}$$

$R^2$ indicates that only 7% of the variability in Y is explained by $X_1$ and $X_2$.

E5) Two types of washing machines, A and B, have been used. Hence, k = 2. Here we have to define one dummy variable $X_2$, which takes the two values 0 and 1: one value if the observation is from machine type A and the other if the observation is from machine type B.

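From the given data, we form the following table to find the required sums and fit the regression equation. The table has the columns:

Time (Y), Distance ($X_1$), ($X_2$), $Y^2$, $X_1^2$, $X_2^2$, $X_1Y$, $X_2Y$, $X_1X_2$

The column totals are:

$\sum Y_i = 66$, $\sum X_{1i} = 747$, $\sum X_{2i} = 5$, $\sum Y_i^2 = 98$, $\sum X_{1i}^2 = 5889$, $\sum X_{2i}^2 = 5$, $\sum X_{1i}Y_i = 9$, $\sum X_{2i}Y_i = 9$, $\sum X_{1i}X_{2i} = 337$

From the above table, putting the values in the normal equations (5) for p = 2 and noting that $X_2^2 = X_2$, we get

$10\hat{B}_0 + 747\hat{B}_1 + 5\hat{B}_2 = 66$ (i)

$747\hat{B}_0 + 5889\hat{B}_1 + 337\hat{B}_2 = 9$ (ii)

$5\hat{B}_0 + 337\hat{B}_1 + 5\hat{B}_2 = 9$ (iii)

From equation (iii), we express one of the coefficients in terms of the other two; call this equation (iv). On putting this expression in equations (i) and (ii) and simplifying, we get two equations, (v) and (vi), in the remaining two coefficients. On solving equations (v) and (vi) and substituting back in equation (iv), we get $\hat{B}_1 = -.4$ and $\hat{B}_2 = .344$, together with the estimate $\hat{B}_0$.

Hence the fitted equation for the model given in equation (7) is

$\hat{Y} = \hat{B}_0 + \hat{B}_1 X_1 + \hat{B}_2 X_2$, with $\hat{B}_1 = -.4$ and $\hat{B}_2 = .344$ (vii)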
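As a rough illustration of the same computation, the sketch below sets up and solves the normal equations $(X'X)\hat{B} = X'Y$ numerically for a model with one quantitative regressor and one 0/1 dummy variable. The data values, the variable names, the 1/0 coding of the machine type and the choice of Python with NumPy are assumptions made only for this illustration; they are not the data of the exercise.

```python
import numpy as np

# Hypothetical data (NOT the data of the exercise):
# time Y, distance X1, and a 0/1 dummy X2 for machine type.
# The coding 1 = type A, 0 = type B is an assumption for this sketch.
y = np.array([16.0, 12.0, 14.0, 18.0, 11.0, 13.0, 20.0, 15.0])
x1 = np.array([5.0, 3.0, 4.0, 6.0, 2.0, 3.0, 7.0, 4.0])
x2 = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])

# Design matrix with a leading column of ones for the intercept B0.
X = np.column_stack([np.ones_like(y), x1, x2])

# Normal equations: (X'X) B_hat = X'Y.
XtX = X.T @ X
XtY = X.T @ y
b_hat = np.linalg.solve(XtX, XtY)

# Cross-check against NumPy's least-squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("X'X =\n", XtX)
print("X'Y =", XtY)
print("B_hat =", b_hat)
```

Because the dummy variable takes only the values 0 and 1, the entry of $X'X$ that holds $\sum X_2^2$ equals $\sum X_2$, which is the simplification used when writing the normal equations by hand.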

Now we can conclude that there is a linear relationship between Y (time) and the two independent variables $X_1$ (distance) and $X_2$ (type of machine). As the regression coefficient for the variable $X_1$ is negative, the fitted time decreases as $X_1$ increases for a given type of machine.

To interpret the regression coefficients in this model, we first consider machine A and put the corresponding value of $X_2$, along with the values of the regression coefficients, in equation (8). The regression model then becomes a straight line in $X_1$ with slope −.4; call this equation (viii).

For machine B, we put the values of the regression coefficients and the other value of $X_2$. Then the regression model becomes

$\hat{Y} = 3.6 - .4X_1$ (ix)

Note that, as discussed earlier in this unit, these estimated regression lines have the same slope, i.e., −.4, but different intercepts.
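The quantities used throughout these solutions (residual sum of squares, estimate of $\sigma^2$, standard errors, t and F statistics, $R^2$ and adjusted $R^2$) can be computed together from the matrix formulas. The following is a minimal sketch under stated assumptions: the data are hypothetical, the dummy coding (1 for one machine type, 0 for the other) is chosen arbitrarily, and Python with NumPy is used only for illustration.

```python
import numpy as np

# Hypothetical data (NOT the data of the exercises).
y = np.array([16.0, 12.0, 14.0, 18.0, 11.0, 13.0, 20.0, 15.0])
x1 = np.array([5.0, 3.0, 4.0, 6.0, 2.0, 3.0, 7.0, 4.0])
x2 = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # assumed 0/1 coding

X = np.column_stack([np.ones_like(y), x1, x2])
n, k = X.shape          # k = p + 1 coefficients, including the intercept
p = k - 1

# Least squares estimates: B_hat = (X'X)^{-1} X'Y.
XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ (X.T @ y)

# Sums of squares as in the ANOVA table.
ss_res = y @ y - b_hat @ (X.T @ y)      # Y'Y - B_hat'X'Y
ss_total = y @ y - n * y.mean() ** 2    # Y'Y - n * Ybar^2
ss_reg = ss_total - ss_res              # B_hat'X'Y - n * Ybar^2

# Estimate of sigma^2 and variance-covariance matrix of B_hat.
sigma2_hat = ss_res / (n - p - 1)
cov_b = sigma2_hat * XtX_inv
se_b = np.sqrt(np.diag(cov_b))

# Test statistics and goodness-of-fit measures.
t_stats = b_hat / se_b
f_stat = (ss_reg / p) / sigma2_hat
r2 = ss_reg / ss_total
adj_r2 = 1 - (n - 1) * (1 - r2) / (n - p - 1)

# With a 0/1 dummy, the two fitted lines share the slope b_hat[1];
# their intercepts are b_hat[0] (dummy = 0) and b_hat[0] + b_hat[2] (dummy = 1).
intercept_dummy0 = b_hat[0]
intercept_dummy1 = b_hat[0] + b_hat[2]

print("B_hat:", b_hat)
print("S.E.:", se_b)
print("t:", t_stats)
print("F:", f_stat, "R2:", r2, "adjusted R2:", adj_r2)
```

The t and F values produced this way would still have to be compared with tabulated values for the chosen α and the degrees of freedom n − p − 1 and (p, n − p − 1), exactly as in the worked solutions.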


More information

CLRM estimation Pietro Coretto Econometrics

CLRM estimation Pietro Coretto Econometrics Slide Set 4 CLRM estimatio Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Thursday 24 th Jauary, 2019 (h08:41) P. Coretto

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

1 General linear Model Continued..

1 General linear Model Continued.. Geeral liear Model Cotiued.. We have We kow y = X + u X o radom u v N(0; I ) b = (X 0 X) X 0 y E( b ) = V ar( b ) = (X 0 X) We saw that b = (X 0 X) X 0 u so b is a liear fuctio of a ormally distributed

More information

Full file at

Full file at Chapter Ecoometrics There are o exercises or applicatios i Chapter. 0 Pearso Educatio, Ic. Publishig as Pretice Hall Chapter The Liear Regressio Model There are o exercises or applicatios i Chapter. 0

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual

More information

Chimica Inorganica 3

Chimica Inorganica 3 himica Iorgaica Irreducible Represetatios ad haracter Tables Rather tha usig geometrical operatios, it is ofte much more coveiet to employ a ew set of group elemets which are matrices ad to make the rule

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor.

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor. Regressio, Part I I. Differece from correlatio. II. Basic idea: A) Correlatio describes the relatioship betwee two variables, where either is idepedet or a predictor. - I correlatio, it would be irrelevat

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments: Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal

More information

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

September 2012 C1 Note. C1 Notes (Edexcel) Copyright   - For AS, A2 notes and IGCSE / GCSE worksheets 1 September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright

More information

Question 1: Exercise 8.2

Question 1: Exercise 8.2 Questio 1: Exercise 8. (a) Accordig to the regressio results i colum (1), the house price is expected to icrease by 1% ( 100% 0.0004 500 ) with a additioal 500 square feet ad other factors held costat.

More information

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

TAMS24: Notations and Formulas

TAMS24: Notations and Formulas TAMS4: Notatios ad Formulas Basic otatios ad defiitios X: radom variable stokastiska variabel Mea Vätevärde: µ = X = by Xiagfeg Yag kpx k, if X is discrete, xf Xxdx, if X is cotiuous Variace Varias: =

More information

(X i X)(Y i Y ) = 1 n

(X i X)(Y i Y ) = 1 n L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information