THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA APPLIED STATISTICS PAPER I

Size: px
Start display at page:

Download "THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA APPLIED STATISTICS PAPER I"

Transcription

1 THE ROYAL STATISTICAL SOCIETY 006 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA APPLIED STATISTICS PAPER I The Socety provdes these solutons to assst canddates preparng for the examnatons n future years and for the nformaton of any other persons usng the examnatons. The solutons should NOT be seen as "model answers". Rather, they have been wrtten out n consderable detal and are ntended as learnng ads. Users of the solutons should always be aware that n many cases there are vald alternatve methods. Also, n the many cases where dscusson s called for, there may be other vald ponts that could be made. Whle every care has been taken wth the preparaton of these solutons, the Socety wll not be responsble for any errors or omssons. The Socety wll not enter nto any correspondence n respect of these solutons. Note. In accordance wth the conventon used n the Socety's examnaton papers, the notaton log denotes logarthm to base e. Logarthms to any other base are explctly dentfed, e.g. log 10. RSS 006

2 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 1 () The AR(p) tme seres model s Yt = φ0 + φ1yt 1+ φyt φpyt p + εt where Y t represents the seres at tme t and φ 0, φ 1,, φ p are constants. The {ε t } are "pure error" or "whte nose" random terms, ndependently dentcally dstrbuted N(0, σ ε ), and not correlated wth {Y t }. The MA(q) tme seres model s Y t = ε t + θε 1 t 1+ θε t θqεt q where {Y t }, {ε t } are as above and θ 1, θ,, θ q are constants. ()(a) Yt 5 εt 0.3ε t 1 = + s a statonary process. [ ] 0 [ ] 5 [ ε ] 0.3 [ ] EY = + E E =5. t t εt 1 Also, agan usng the condtons n part (), ( Y ) ( ) ( ) ( ) E ε = (as n part ()), so t εt εt 1 σε Var = 0 + Var Var = The autocovarance s ( Y Y ) ( ) γ = Cov, = Cov 5 + ε 0.3 ε, 5 + ε 0.3ε k t t k t t 1 t k t k 1 ( ) 0.3Var εt 1 = 0.3σε f k = 1 = 0.3Var ( εt ) = 0.3σε f k = 1 0 otherwse t Thus the autocorrelaton ρ k s 0.3 = 0.75 for k = ± 1, and 0 otherwse So the partal autocorrelaton functon decays to 0 and s negatve. Soluton contnued on next page

3 ()(b) Y t = Y t + ε t s statonary (ths s gven n the queston). Also, E[Y t ] = E[Y t ], so [ ] 34 EY t = = ( ) = ( ) ( ) + σ, so ( Y ) Var Yt 0.1 Var Yt ε For the autocovarance, consder σ ε Var t = = 1.01σ ε. 1 (0.1) ( Y Y ) = ( Y Y + ε ) 0.1 cov (, ) Cov, Cov, t t k t t k t k =. Yt Yt k γ k 0.1γ k = and ρk = 0.1ρ k. However, ρ 1 = 0 and ρ = 0.1. Hence we have ρ 3 = ρ 5 = ρ 7 = = 0, and also ρ = 0.1, ρ 4 = (0.1), ρ 6 = (0.1) 3, ρ 8 = (0.1) 4,. The partal autocorrelaton functon cuts off sharply after k = (wth a negatve spke at k = ). () Yt = 54 0.Yt 1 + εt. ( ) 1+ 0.LY= 54+ where L represents the "backward shft" operator. ( 1+ 0.L) t εt ( 1 0. ( 0.) ( 0.3 ) )( 54 + εt ) 54 + εt Y = = L+ L L + t ( ) ( ) 3 t t 1 t t 3. = 54 + ε 0.ε + 0. ε 0. ε +...

4 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston () (a) Varance and covarance depend on the unts of measurement, but correlaton does not. These varables are n completely dfferent unts of measurement, and ths would serously nfluence the results f the covarance matrx were used. There s also the ssue (see ()(a)) of whether, for example, length of road should be measured n mles or klometres. It would not be approprate to use the covarance matrx. (b) When the prncpal components are arranged n order of sze of egenvalues, as here, "cumulatve proportons" are found by addng egenvalues from the left, endng wth a total (n ths example) of 4, the number of varables. The "proportons" for each egenvalue here are = 55.75%, 100 = 33.5, 100 = 6.5%, so the cumulatve proportons are 55.75, 89.00, 95.5 and 100%. (c) Most of the varaton s explaned by PC1 and PC, leavng only 11% for PC3 and PC4 together. An explanaton of the data mght therefore be gven n terms of these two. The coeffcents n the components are the weghtngs of the four varables n each one. PC1 s a (weghted) average of all four varables, as often happens when usng the correlaton matrx; populaton s of less mportance than the other three varables. PC s manly a contrast between populaton and length of road. Ths mght reflect the effect of length of road outsde man populaton centres. PC3 seems to be a contrast between number of drvers and (populaton densty, length of road) but does not contrbute much to the total. It mght be some form of measure of the number of drvers outsde man populaton centres. PC4 s a contrast between fuel consumpton and the others, but contrbutes lttle. Perhaps t s manly a contrast between number of drvers and fuel consumpton, and thus could gve a measure of the number of drvers wth hgh fuel consumpton. (Note that components that contrbute lttle and mght reasonably be gnored can sometmes gve nformaton about what s not mportant.) Soluton contnued on next page

5 () (a) Snce the correlaton matrx has been used, changng the unts as suggested would make no dfference to the results. Identcal prncpal components and regresson analyses would be obtaned. The whole analyss s ndependent of the orgnal scales of measurement. (b) A smple explanaton, wth farly hgh R, comes from PC1, and shows that deaths are postvely related to number of drvers, length of road and fuel consumpton, but depend lttle on populaton densty. Addng PC3 and PC4 mproves the explanaton and leads to a very hgh R. Note that the nterpretatons depend on the sgns of the coeffcents. () Unless there are correlatons among the predctor varables (whch of course there often are), there s nothng to gan from usng prncpal components. The prncpal components themselves are uncorrelated, so ths makes model selecton easer f a "sutable" subset s known. However, the prncpal components are not always easy to nterpret as they are purely mathematcal constructs, and thus the resultng regresson models may not be easy to nterpret. (Note for example that the negatve coeffcent of PC4 n the regresson analyss n part () suggests that the number of deaths s postvely related to drvers wth low fuel consumpton, whch seems strange.) It can be dffcult to choose a "sutable" subset of prncpal components. In ths example, PC appeared mportant n the explanaton of the data n part () but dd not enter the regresson analyss n part (). On the other hand, t can happen that components wth low egenvalues are mportant n the regresson as appears to have happened wth PC3 and PC4 here. One way to try to deal wth ths s to nclude the response varable n the prncpal components analyss and select PCs wth hgh coeffcents of the response varable and large egenvalues; but there s no guarantee that mportant regresson varables wll be dentfed.

6 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 3 () Both methods can be carred out on data from groups of tems for whch there are multple measurements. In cluster analyss the groups are not known already, but are constructed from the data; the analyss nvestgates possble groupngs, based on the multple measurements. Dscrmnant analyss s carred out when the (two or more) groups are known already and the am s to fnd the (lnear) combnaton of the varables whch best separates the tems nto ther groups. The method assumes that the data n each group are from multvarate Normal dstrbutons wth smlar varance-covarance structures. As an example, dscrmnant analyss would be approprate for dstngushng between those anmals whch survve n a hard envronment and those whch do not, usng a range of physologcal measurements on bodes. A sutable dscrmnatory combnaton of these measurements would have been found from several speces for whch t was known whether or not they survve. The same combnaton would then be appled to measurements on a dfferent speces to gve a predcton of whether or not t would be lkely to survve. () (a) For d(x, y) to be a proper dstance measure for X and Y, we requre: d(x, x) = 0; d(x, y) = d(y, x); d(x, y) d(x, z) + d(z, y) for all Z. Certanly d(a, A) = d(b, B) = d(c, C) = 0, from the dagonal entres. Also, d(a, B) = 3.9 = d(b, A); and smlarly for d(a, C) and d(b, C). Fnally, d(a, C) = 5.5, wth d(a, B) + d(b, C) = = 6.9 > 5.5; d(b, C) = 3.0, wth d(b, A) + d(a, C) = = 9.4 > 3.0; and d(a, B) = 3.9 wth d(a, C) + d(c, B) = = 8.5 > 3.9. (b) At stage 1, tems A and H have dstance 1.0 and these form a cluster. Cluster s D and E wth dstance 1.7, so stage has two clusters, AH and DE. The next cluster s BC wth dstance 3.0. The next dstance s 3.7, for AF and for BH; as shown n the dendrogram, we therefore combne ABCFH as a cluster at ths dstance, from whch the cluster DE s stll separate. The next dstance s 4.6, for CD and CG; thus at ths dstance we are down to a sngle cluster. [Note. In any practcal case we may specfy the number of clusters requred as a reasonable summary of the data or we may specfy the maxmum dstance for combnng clusters. As an llustraton of the latter, suppose we specfed dstance 3.1; at ths dstance there are 3 clusters, AH, BC and DE, and the other tems stand on ther own.] (c) The dendrogram appears rather "straggly". A and H seem closely smlar, as do D and E, but otherwse there are no very clear and concse clusters. There s not really any strong evdence of "dstnct categores" by the sngle lnkage cluster analyss used here, though dfferent methods of clusterng mght gve dfferent structures. The dendrogram s on the next page

7 Dstance A H F B C D E G Item

8 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 4 () (a) The probablty functon of the bnomal dstrbuton B(m, π) wrtten n exponental form s m f y y m y y (, π ) = exp log + logπ + log ( 1 π) log ( 1 π) π log, = log + log 1 + constant 1 π so f ( y π ) y m ( π ). As y s on the rght of ths equaton, t s n canoncal form, and the multpler of y s the natural parameter, whch s therefore π log. 1 π [Ths s the log-odds, or logt.] The generalsed lnear model sets ths lnk functon equal to the lnear predctor. (b) The odds s the rato of probabltes of "success" and "falure" for Y,.e. π /(1 π ). The log-odds s smply the logarthm (base e) of ths, as used n the lnk functon. After the generalsed lnear model has been ftted, the (estmated) value of η s obtaned ths s the estmate of the log-odds. We have π π η = log = exp( η) 1 π 1 π so the estmate of the odds s exp(η ). Gven the necessary standard errors (SE), approxmate 95% confdence lmts are "estmate ± 1.96 SE" for each of odds and logodds. The detals are n ()(c) below. () (a) We are not told how the samplng was carred out, so the ndependence of observatons s not guaranteed; nether s the randomness. The analyss would be approprate f a random sample of data from a larger populaton has been selected, omttng multple brths (twns etc) and only usng a mother once f she has had more than one chld at dfferent tmes [to avod probable lack of ndependence]. Many hosptals would need to be represented n the samplng, as well as home brths. It would not be approprate to use ths analyss f the "group of women" mentoned came from a lmted area, for example by studyng all brths from the local hosptal over a few years. Soluton contnued on next page

9 (b) Step 1 chooses the sngle predctor varable whch reduces the scaled devance as much as possble from the "constant only" model. Clearly ths s GEST, the length of gestaton perod, whch reduces the devance by (on 1 df). We next consder addng AGE, and ths step further reduces the devance by 6.566, also on 1 df; ths s sgnfcant as an observaton from χ wth 1 df, so AGE should be ncluded. So we use AGE and GEST n the model. (c) The codng AGE = 0, GEST = 0 gves ( ˆ η) ˆ η = , SE = The estmate of the odds s exp( ) = % confdence lmts for η are ± ,.e. (.00, 1.51), so the correspondng lmts for the odds are (0.137, 0.05) after exponentatng. The estmate of the probablty of mortalty s ˆ η e ˆ π = = = ˆ η 1+ e Smlarly, the upper and lower lmts of the 95% confdence nterval for ths probablty are as follows. Lower lmt: = ; upper lmt: =. (d) GEST s now to be coded as 1, AGE remanng zero. The log-odds rato for ths group compared to the group n (c) s thus smply the value of the GEST parameter,.e So the odds rato s exp( 3.886) = % confdence lmts for ths log-odds rato are ± ,.e. ( 3.650,.97). Thus the lmts for the odds rato are exp( 3.650) = 0.06 and exp(.97) =

10 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 5 = + +, where the { ε } ()(a) The approprate model for log(y) s log y log a bx ε ' are ndependent dentcally Normally dstrbuted errors, N(0, σ ). ' Thus for the untransformed data we have lognormal. bx y ae = ε where the ε are ()(b) Denote log(y) by Y and log(a) by A so that Y = A + bx + ε '. The normal equatons are obtaned by mnmsng ( ). S = Y A bx Dfferentatng wth respect to A and B leads to the followng (strctly speakng t should be checked that these are mnmsng values). S = ( Y A bx ). A Settng ths equal to 0 gves  = Y bx ˆ. Y = Aˆ + bx ˆ as one normal equaton. So S = ( ). b x Y A bx Settng ths equal to 0 gves ˆ ˆ Thus x Y = ( Y bx ) x + b x x Y Aˆ x bˆ = + x. ΣxΣy ΣxY ˆ n nσxy ( Σx)( ΣY) b = or ( Σx ) nσx ( Σx) Σx n Soluton contnued on next page

11 bx () Here we smlarly mnmse S = ( y ) ae as follows. S bx = bx ( y ae ) e. a, gvng two normal equatons Settng ths equal to 0 gves bx ˆ ˆ bx ye ˆ = a e. s = bx ( ). b bx y ae axe Settng ths equal to 0 gves bx ˆ ˆ bx ˆ = or aˆ x ye a xe bx ˆ ˆ bx x ˆ ye = a xe. From the frst normal equaton we have Insertng ths n the second gves aˆ = bx ˆ ye e bx ˆ. ˆ bx ˆ ˆ ˆ bx bx bx xye e = ye xe (Ths would requre an teratve method of soluton such as Newton- Raphson.) () (a) Frst model: aˆ = e =.504; b = ˆ Second model: aˆ =.45; bˆ = 1.0. The estmates are very smlar. (b) (c) The stuaton s not clear-cut. The standardsed resduals from the frst model are perhaps more scattered than those from the second, and wth an appearance of becomng less varable as x ncreases. For the second model, there s only one outler among the resduals but the pattern may be slghtly skew. The outler n ths model corresponds to the pont where there s a slght jump n the orgnal scatter dagram, and ths also produces a somewhat hgh standardsed resdual from the frst model. Perhaps there s a very slght preference for the second model. Any exstng knowledge about the system wll be helpful, as wll any pror nformaton on the error structure s t Normal or lognormal (or nether)?

12 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 6 () The sngle varable whch reduces the resdual sum of squares most from the model wth ntercept alone s x 4, so forward selecton frst ntroduces ths varable. It then examnes addng another varable to x 4. The smallest resdual sum of squares for x 4 and one other varable s when x 1 s taken wth x 4. Next to be entered would be x but ths does not seem to make much dfference compared wth (x 1, x 4 ). Addng x 3 as well does not seem to make a worthwhle dfference ether. Formal tests for sgnfcance at each stage are as follows. (1) Addng x 4. The SS for addng x 4 s = , and the remanng resdual then has 11 df. So the test statstc s =.80, 11 whch s very hghly sgnfcant as an observaton from F 1,11 ; there s very strong evdence that x 4 should be ncluded n the model. () Addng x 1. The SS for dong ths s = , and the remanng resdual has 10 df. The test statstc s = 108., whch s very hghly sgnfcant as an observaton from F 1,10 ; there s very strong evdence that x 1 should also be ncluded n the model. (3) Addng x. The SS for dong ths s = 6.789, and the remanng resdual has 9 df. The test statstc s = 5.03, whch s not (qute) sgnfcant at the 5% level as an observaton from F 1,9 (the 5% crtcal pont s 5.1). Judged at the 5% level, there s no evdence that x should also be ncluded n the model. Thus the model reached by forward selecton has x 1 and x 4. Soluton contnued on next page

13 () Not f there are correlatons among the predctor varables. Startng from the full model and omttng varables can gve very dfferent results from forward selecton. () Mallows' C p s a dagnostc statstc desgned to help n dentfyng a "best subset" model. The defnton of C p s where C p RSS p = p s ( n ) n s the number of data values, p s the number of parameters (ncludng the constant) n the model beng nvestgated, RSS p s the resdual sum of squares for the model beng nvestgated, s s the resdual mean square for the full model usng all possble predctors. It can be shown that, for a satsfactory model, E[C p ] = p. Lookng at the values of C p gven n the queston, ths suggests that the (x 1, x 4 ) model may be not qute "good enough" as t has a C p value of 5.50 as opposed to an expectaton of. All of the three-predctor models seem satsfactory apart from (x, x 3, x 4 ). (v) The two-predctor model (x 1, x ) appears good, both from ts resdual sum of squares and from ts C p value. It also has a hgher R value than any of the other models wth fewer than three predctors. Ths model has not appeared n the forward selecton because nether x 1 nor x was the frst varable chosen, but t appears to be the best parsmonous model. (v) It s always mportant to have nformaton about the practcal stuaton, to avod proposng a model that researchers or practtoners n the feld consder unreasonable. They may do so from ther general techncal knowledge of the feld and/or n the lght of earler work whch has shown that some potental varables n fact have lttle relaton to the response y. Correlatons among the x varables may, mathematcally, lead to such "unreasonable" models. If there s a choce, those varables that are easer and cheaper to measure accurately would often be preferred. Why have the varables x 1, x, x 3, x 4 n the example been suggested?

14 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 7 () y x There s a consderable amount of scatter but some ndcaton of a weak postve assocaton between y and x. The varance of the y varable looks as f t could be assumed constant there s no apparent pattern (such as a dependence on x). () Any assocaton that exsts between y and x s not obvously curved, and does appear to have a lnear component. So a lnear regresson model seems reasonable. Because there are repeat observatons on y at some of the x- values, a "pure error" term can be extracted from the resdual as the sum of squares between these repeats (see below). The remander of the resdual ("lack of ft") then represents departure from lnearty, whch can be tested aganst the "pure error". Ths should gve a better test of the lnear regresson model. The model s Y j = a + bx + ε j = 1,,, 13 j = 1 or or 3, dependng on the value of. where x s a value of x, Y j represent the (sngle or repeat) observatons taken at x = x, {ε j } are ndependent normal N(0, σ ) random varaton terms wth constant varance. Soluton contnued on next page

15 () (a) At x = 1., we have y =. and 1.9, wth total 4.1. So the pure error 4.1 SS here s = Ths has 1 degree of freedom. (b) (c) At x = 4.1, we have y =.8,.8 and.1, wth total 7.7. So the pure error SS here s = Ths has df. 3 At each x value where there are repeats, a smlar calculaton s carred out. The sums of squares are added to obtan the total pure error SS. The numbers of degrees of freedom would also be added to obtan the total df for "pure error", whch here wll be 9 (ths s needed below, n part (v)). (v) If the "pure error" SS s , the "lack of ft" SS must be =.837, and ths wll have 0 9 = 11 df. Hence the analyss of varance s Source of varaton df Sum of Mean F value squares square Regresson / = 5.50 Lack of ft / = 0.43 Pure error = ˆ σ Total The F value for regresson (note that ths s now a comparson wth the pure error term) s referred to the F 1,9 dstrbuton. Ths s sgnfcant at the 5% level (crtcal pont s 5.1), so there s some evdence n favour of the regresson model. There s no evdence of lack of ft. (It could be argued that the lack of ft and pure error SSs should therefore be recombned to gve the resdual as before, wth 0 df.) We note that R =.673/9.377 = 8.6%, whch s low; despte the absence of evdence for lack of ft, only about 9% of the varaton n the data s explaned by the lnear regresson model. Ths s because the underlyng varablty (estmated by the pure error mean square) s hgh. (v) Resduals after fttng the proposed model can be examned, and any patterns n them noted. Departures from the model can be detected n ths way, such as a need for an addtonal term, or systematc non-constant varance, etc.

16 Graduate Dploma, Appled Statstcs, Paper I, 006. Queston 8 Part () Totals are as follows For A A1: 56; A: 78; A3: 83. For B B1: 145; B: 7. Grand total: 17. Sum of squares of observatons: The (corrected) total sum of squares s 1977 = , wth 9 df. 30 The sum of squares for factor A s = 41.67, wth df The sum of squares for factor B s = , wth 1 df The sum of squares for the nteracton AB s SS for A SS for B= 6.067, wth df. The resdual SS s obtaned by subtracton. Ths has 9 1 = 4 df. A s a fxed factor n both (a) and (b) below; ts nterpretaton s the same n the two parts. B s fxed n (a) and random n (b), so ts nterpretaton s dfferent n the two parts, and ths also apples for the nteracton AB. Soluton contnued on next page

17 (a) The model s ( ) y = μ + α + β + αβ + ε jk j j jk = 1,, 3; j = 1, ; k = 1,,, 5 α = 0, β j = 0, ( αβ ) = 0, ( αβ ) = 0 j. j j j Here μ, α, β j, (αβ) j are all unknown constants, representng the grand mean, the effect of level of A, the effect of level j of B, and the nteracton between A at level and B at level j, respectvely The "error" terms {ε jk } are ndependent Normal random varables, N(0, σ ), wth σ constant. The analyss of varance s as follows. A column of expected mean squares (E[MS]) s nserted n the table. Source of varaton df Sum of squares Mean square E[MS] A σ 5 ( ) B ( ) AB σ ( 5 ) 1 F value + Σ α 0.633/5.67 = 3.9 σ + Σ β j /5.67 = ΣΣ( αβ ) j /5.67 = 5.89 Resdual = ˆ σ Total Tests for the null hypotheses "all α = 0", "all β j = 0", "all (αβ) j = 0" use the gven F values. The upper 5% pont of F,4 s 3.40 and the upper 1% pont s 5.61, so the frst of these s sgnfcant at the 5% level and the thrd at the 1% level. For F 1,4, the upper 0.1% pont s 14.03, so the second s very hghly sgnfcant. There s evdence that both man effects and the nteracton are mportant. As the nteracton s sgnfcant, the nterpretaton should be based on a -way table of AB means. The response for B s much the same at each level of A; however, for B1 the response at A1 s well below the other two, wth A3 a lttle larger than A. Soluton contnued on next page

18 (b) The model s ( ) y = μ + α + β + αβ + ε jk j j jk = 1,, 3; j = 1, ; k = 1,,, 5. Here μ and the α are unknown constants, wth Σα = 0, as before, and the "error" terms {ε jk } are ndependent Normal random varables, N(0, σ ), wth σ constant, as before. The β j are uncorrelated random varables, uncorrelated wth the (αβ) j and wth the {ε jk }, wth mean 0 and constant varance σ B. The (αβ) j represent the nteracton. For each level of A (.e. for each ) they are uncorrelated random varables, uncorrelated wth the β j and wth the {ε jk }, wth mean 0 and constant varance σ AB. For each level of B (.e. for each j) they are constants wth Σ ( αβ ) =. j 0 Assumptons of Normalty for the β j and (αβ) j are added for the formal nferences, so that these become ndependent Normal random varables. The analyss of varance s as follows. A column of expected mean squares (E[MS]) s agan nserted n the table. Source of varaton df Sum of squares Mean square E[MS] σ + σ + Σ α A ( 5 ) B AB AB σ + (5 3) σ B σ + 5σ AB Resdual σ F value 0.633/ = /5.67 = /5.67 = 5.89 Total The null hypotheses are "all α = 0", "σ B = 0" and "σ AB = 0". The expected mean squares ndcate how these are tested. Soluton contnued on next page

19 The F value to test the frst null hypothess s 0.665, whch s referred to F,. Ths s not sgnfcant; there s no evdence aganst ths null hypothess. The F value to test the second s 33.73, whch s referred to F 1,4. Ths s very hghly sgnfcant (see part (a) above); there s very strong evdence aganst ths null hypothess. The F value to test the thrd s 5.89, whch s referred to F,4. Ths s sgnfcant at the 5% level (see part (a) above); there s evdence aganst ths null hypothess. Estmates of σ B and σ AB can be obtaned f requred. The explanaton should be n terms of there beng evdence for varaton among the effects of the levels of B n the underlyng populaton of all such levels, and smlarly for the AB nteracton. Part () Suppose that A1, A, A3 are three alternatve cultvaton treatments n an agrcultural tral, whle B1, B are two stes upon whch that experment s carred out. If B1, B are the only two avalable stes about whch nferences are to be made, (a) s approprate. If B1, B are selected (at random) from several avalable stes and nference s to be made for the whole collecton of stes, (b) s approprate.

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

January Examinations 2015

January Examinations 2015 24/5 Canddates Only January Examnatons 25 DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR STUDENT CANDIDATE NO.. Department Module Code Module Ttle Exam Duraton (n words)

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Diagnostics in Poisson Regression. Models - Residual Analysis

Diagnostics in Poisson Regression. Models - Residual Analysis Dagnostcs n Posson Regresson Models - Resdual Analyss 1 Outlne Dagnostcs n Posson Regresson Models - Resdual Analyss Example 3: Recall of Stressful Events contnued 2 Resdual Analyss Resduals represent

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Chapter 15 Student Lecture Notes 15-1

Chapter 15 Student Lecture Notes 15-1 Chapter 15 Student Lecture Notes 15-1 Basc Busness Statstcs (9 th Edton) Chapter 15 Multple Regresson Model Buldng 004 Prentce-Hall, Inc. Chap 15-1 Chapter Topcs The Quadratc Regresson Model Usng Transformatons

More information

STATISTICS QUESTIONS. Step by Step Solutions.

STATISTICS QUESTIONS. Step by Step Solutions. STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Dr Tom Ilvento Department of Food and Resource Economcs Overvew The last part of the course wll focus on Regresson Analyss Ths s one of the more powerful statstcal technques Provdes

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 14 Multple Regresson Models 1999 Prentce-Hall, Inc. Chap. 14-1 Chapter Topcs The Multple Regresson Model Contrbuton of Indvdual Independent Varables

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study

More information

ANOVA. The Observations y ij

ANOVA. The Observations y ij ANOVA Stands for ANalyss Of VArance But t s a test of dfferences n means The dea: The Observatons y j Treatment group = 1 = 2 = k y 11 y 21 y k,1 y 12 y 22 y k,2 y 1, n1 y 2, n2 y k, nk means: m 1 m 2

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

Chapter 14: Logit and Probit Models for Categorical Response Variables

Chapter 14: Logit and Probit Models for Categorical Response Variables Chapter 4: Logt and Probt Models for Categorcal Response Varables Sect 4. Models for Dchotomous Data We wll dscuss only ths secton of Chap 4, whch s manly about Logstc Regresson, a specal case of the famly

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9 Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

Correlation and Regression

Correlation and Regression Correlaton and Regresson otes prepared by Pamela Peterson Drake Index Basc terms and concepts... Smple regresson...5 Multple Regresson...3 Regresson termnology...0 Regresson formulas... Basc terms and

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes 25/6 Canddates Only January Examnatons 26 Student Number: Desk Number:...... DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR Department Module Code Module Ttle Exam Duraton

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III Ftted Values and Resduals US Domestc Beers: Calores vs. % Alcohol To each observed x, there corresponds a y-value on the ftted lne, y ˆ = βˆ + βˆ x. The are called ftted

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

18. SIMPLE LINEAR REGRESSION III

18. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III US Domestc Beers: Calores vs. % Alcohol Ftted Values and Resduals To each observed x, there corresponds a y-value on the ftted lne, y ˆ ˆ = α + x. The are called ftted values.

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1 C/CS/Phy9 Problem Set 3 Solutons Out: Oct, 8 Suppose you have two qubts n some arbtrary entangled state ψ You apply the teleportaton protocol to each of the qubts separately What s the resultng state obtaned

More information

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected. ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency

More information

Statistics Chapter 4

Statistics Chapter 4 Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment

More information

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε Chapter 3 Secton 3.1 Model Assumptons: Multple Regresson Model Predcton Equaton Std. Devaton of Error Correlaton Matrx Smple Lnear Regresson: 1.) Lnearty.) Constant Varance 3.) Independent Errors 4.) Normalty

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory

More information

Statistics for Business and Economics

Statistics for Business and Economics Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear

More information

Primer on High-Order Moment Estimators

Primer on High-Order Moment Estimators Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE)

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) June 7, 016 15:30 Frst famly name: Name: DNI/ID: Moble: Second famly Name: GECO/GADE: Instructor: E-mal: Queston 1 A B C Blank Queston A B C Blank Queston

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 014-015 MTH35/MH3510 Regresson Analyss December 014 TIME ALLOWED: HOURS INSTRUCTIONS TO CANDIDATES 1. Ths examnaton paper contans FOUR (4) questons

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

Learning Objectives for Chapter 11

Learning Objectives for Chapter 11 Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method

More information

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X.

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X. ( ) ( ) where ( ) 1 ˆ β = X X X X β + ε = β + Aε A = X X 1 X [ ] E ˆ β β AE ε β so ˆ = + = β s unbased ( )( ) [ ] ˆ Cov β = E ˆ β β ˆ β β = E Aεε A AE ε ε A Aσ IA = σ AA = σ X X = [ ] = ( ) 1 Ftted values

More information

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression STAT 45 BIOSTATISTICS (Fall 26) Handout 5 Introducton to Logstc Regresson Ths handout covers materal found n Secton 3.7 of your text. You may also want to revew regresson technques n Chapter. In ths handout,

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Describing Data Using Numerical Measures Chapter 3 Student Lecture Notes 3-1 Chapter 3 Descrbng Data Usng Numercal Measures Fall 2006 Fundamentals of Busness Statstcs 1 Chapter Goals To establsh the usefulness of summary measures of data. The

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

ISQS 6348 Final Open notes, no books. Points out of 100 in parentheses. Y 1 ε 2

ISQS 6348 Final Open notes, no books. Points out of 100 in parentheses. Y 1 ε 2 ISQS 6348 Fnal Open notes, no books. Ponts out of 100 n parentheses. 1. The followng path dagram s gven: ε 1 Y 1 ε F Y 1.A. (10) Wrte down the usual model and assumptons that are mpled by ths dagram. Soluton:

More information

Midterm Examination. Regression and Forecasting Models

Midterm Examination. Regression and Forecasting Models IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unt 10: Smple Lnear Regresson and Correlaton Statstcs 571: Statstcal Methods Ramón V. León 6/28/2004 Unt 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regresson analyss s a method for studyng the

More information

The SAS program I used to obtain the analyses for my answers is given below.

The SAS program I used to obtain the analyses for my answers is given below. Homework 1 Answer sheet Page 1 The SAS program I used to obtan the analyses for my answers s gven below. dm'log;clear;output;clear'; *************************************************************; *** EXST7034

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting. The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Chapter 15 - Multiple Regression

Chapter 15 - Multiple Regression Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term

More information