Stat 502 Design and Analysis of Experiments General Linear Model

Slide 1: Stat 502 Design and Analysis of Experiments: General Linear Model
Fritz Scholz, Department of Statistics, University of Washington
December 6, 2013

Slide 2: General Linear Hypothesis

We assume the data vector $Y = (Y_1,\dots,Y_N)$ consists of independent $Y_i \sim N(\mu_i, \sigma^2)$, for $i = 1,\dots,N$. We also have a data model hypothesis, namely that the mean vector $\mu$ can be any point in a given $s$-dimensional linear subspace $\Pi_\Omega \subset \mathbb{R}^N$, where $s < N$. Many statistical problems can be formulated as follows: we identify a linear subspace $\Pi_\omega \subset \Pi_\Omega$ of dimension $s - r$, with $0 < r \le s$, and we test the hypothesis $H_0\colon \mu \in \Pi_\omega$ against the alternative $H_1\colon \mu \in \Pi_\Omega \setminus \Pi_\omega$. To distinguish $\Pi_\omega$ from $\Pi_\Omega$ we call $\Pi_\omega$ the test hypothesis.

Slide 3: The Two-Sample Example

Let $Y_1,\dots,Y_m \sim N(\xi, \sigma^2)$ and $Y_{m+1},\dots,Y_{m+n} \sim N(\eta, \sigma^2)$ be independent samples. Here $N = m + n$, and the data model below specifies or reflects two samples with common variance $\sigma^2$ and possibly different means $\xi$ and $\eta$.

$\Pi_\Omega$ is 2-dimensional ($s = 2$), consisting of all vectors of the form
$$\mu = (\mu_1,\dots,\mu_N) = \xi\,(\underbrace{1,\dots,1}_{m\ \text{1's}},\underbrace{0,\dots,0}_{n\ \text{0's}}) + \eta\,(\underbrace{0,\dots,0}_{m\ \text{0's}},\underbrace{1,\dots,1}_{n\ \text{1's}}) = \xi a_1 + \eta a_2.$$
The orthogonal vectors $a_1$ and $a_2$ span $\Pi_\Omega$.

Here we want to test the hypothesis $H_0\colon \xi = \eta$, i.e., $\mu = (\mu_1,\dots,\mu_N) = \xi(a_1 + a_2)$, and $\Pi_\omega$ is the linear subspace spanned by $1 = a_1 + a_2 = (\underbrace{1,\dots,1}_{N\ \text{1's}})$, i.e., $r = 1$ and $s - r = 1$.

Slide 4: The k-Sample Example

Let $Y_{i1},\dots,Y_{in_i} \sim N(\xi_i, \sigma^2)$, $i = 1,\dots,k$, be independent random samples. Here $N = n_1 + \dots + n_k$, and we have used the traditional double indexing on the $Y$'s, but we could equally well have used a more awkward single index $i = 1,\dots,N$. The data model below specifies or reflects $k$ samples with common variance $\sigma^2$ and possibly different means $\xi_1,\dots,\xi_k$.

$\Pi_\Omega$ is $k$-dimensional ($s = k$), consisting of all vectors of the form
$$\mu = (\mu_{11},\dots,\mu_{kn_k}) = \xi_1\,(\underbrace{1,\dots,1}_{n_1\ \text{1's}},\underbrace{0,\dots,0}_{(N-n_1)\ \text{0's}}) + \dots + \xi_k\,(\underbrace{0,\dots,0}_{(N-n_k)\ \text{0's}},\underbrace{1,\dots,1}_{n_k\ \text{1's}}) = \xi_1 a_1 + \dots + \xi_k a_k.$$
The orthogonal vectors $a_1,\dots,a_k$ span $\Pi_\Omega$; $a_i$ has 1's in positions $(i,1),\dots,(i,n_i)$ and 0's in the remaining positions.

Slide 5: $H_0\colon \xi_1 = \dots = \xi_k$

Here we want to test the hypothesis $H_0\colon \xi_1 = \dots = \xi_k$, i.e.,
$$\mu = (\mu_{11},\dots,\mu_{kn_k}) = \xi_1 a_1 + \dots + \xi_k a_k = \xi_1 (a_1 + \dots + a_k).$$
$\Pi_\omega$ is the $(s-r)$-dimensional linear subspace spanned by $1 = a_1 + \dots + a_k = (\underbrace{1,\dots,1}_{N\ \text{1's}})$, i.e., $s - r = k - r = 1$, or $r = k - 1$.

Slide 6: Least Squares Estimates (LSE's)

The $\Pi_\Omega$-least squares estimate $\hat\mu = \hat\mu(Y)$ of $\mu$ minimizes $\|Y - \mu\|^2 = \sum_{i=1}^N (Y_i - \mu_i)^2$ over $\mu \in \Pi_\Omega$:
$$\hat\mu = \hat\mu(Y) = \text{projection of } Y \text{ onto } \Pi_\Omega = \text{point in } \Pi_\Omega \text{ closest to } Y.$$

The $\Pi_\omega$-least squares estimate $\hat{\hat\mu} = \hat{\hat\mu}(Y)$ of $\mu$ minimizes $\|Y - \mu\|^2 = \sum_{i=1}^N (Y_i - \mu_i)^2$ over $\mu \in \Pi_\omega$:
$$\hat{\hat\mu} = \hat{\hat\mu}(Y) = \text{projection of } Y \text{ onto } \Pi_\omega = \text{point in } \Pi_\omega \text{ closest to } Y.$$
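The projection view of the LSE can be sketched numerically. This is an illustrative check, not from the slides: a made-up two-sample data vector (m = 3, n = 2) with numpy's least squares routine playing the role of the projector.

```python
import numpy as np

# Two-sample layout from slide 3: m = 3, n = 2, so N = 5.
# Columns a1, a2 span Pi_Omega; their sum (the all-ones vector) spans Pi_omega.
a1 = np.array([1., 1., 1., 0., 0.])
a2 = np.array([0., 0., 0., 1., 1.])
X_Omega = np.column_stack([a1, a2])
X_omega = np.ones((5, 1))

Y = np.array([4., 6., 5., 1., 3.])   # hypothetical data

def project(Y, X):
    """Orthogonal projection of Y onto the column space of X (the LSE of mu)."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return X @ beta

mu_hat = project(Y, X_Omega)      # group means repeated: (5, 5, 5, 2, 2)
mu_hathat = project(Y, X_omega)   # grand mean repeated: (3.8, ..., 3.8)
print(mu_hat, mu_hathat)
```

As expected, the $\Pi_\Omega$-projection replaces each observation by its group mean and the $\Pi_\omega$-projection by the grand mean.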

Slide 7: General Linear Model Schematic Diagram

[Diagram: the data vector $Y \in \mathbb{R}^N$, its projection $\hat\mu$ onto $\Pi_\Omega$ with residual $Y - \hat\mu = e$, and the further projection $\hat{\hat\mu}$ onto $\Pi_\omega \subset \Pi_\Omega$.]

Slide 8: Comments on the Previous Diagram

$\hat\mu \in \Pi_\Omega$ is the orthogonal projection of the data vector $Y$ onto $\Pi_\Omega$, i.e., the best explanation of $Y$ in terms of $\Pi_\Omega$. The orthogonal complement $e = Y - \hat\mu$ is the residual error vector, i.e., that part of $Y$ not explained by $\hat\mu \in \Pi_\Omega$: $Y = \hat\mu + (Y - \hat\mu) = \hat\mu + e$.

$\hat{\hat\mu}$ is the orthogonal projection of the data vector $Y$ onto $\Pi_\omega$, i.e., it is the best explanation of $Y$ in terms of $\Pi_\omega$. $\hat{\hat\mu}$ is also the orthogonal projection of $\hat\mu$ onto $\Pi_\omega$, i.e., it is the best explanation of $\hat\mu$ in terms of $\Pi_\omega$: $\hat\mu = \hat{\hat\mu} + (\hat\mu - \hat{\hat\mu})$. The orthogonal complement $\hat\mu - \hat{\hat\mu}$ is that part of $\hat\mu$ that cannot be explained by $\Pi_\omega$; it expresses the estimated discrepancy of the unknown $\mu$ from $\Pi_\omega$.

Slide 9: Orthogonal Least Squares Decomposition

$Y$ is the orthogonal sum of the following three component vectors:
$$Y = \hat{\hat\mu} + (\hat\mu - \hat{\hat\mu}) + (Y - \hat\mu) = \hat{\hat\mu} + (\hat\mu - \hat{\hat\mu}) + e.$$
Orthogonality $\Longrightarrow$ these three components are independent of each other, assuming normality of $Y$. Pythagoras:
$$\|Y\|^2 = \|\hat\mu\|^2 + \|Y - \hat\mu\|^2 = \|\hat\mu\|^2 + \|e\|^2$$
$$\|Y\|^2 = \|\hat{\hat\mu}\|^2 + \|Y - \hat{\hat\mu}\|^2$$
$$\|\hat\mu\|^2 = \|\hat{\hat\mu}\|^2 + \|\hat\mu - \hat{\hat\mu}\|^2$$
$$\|Y - \hat{\hat\mu}\|^2 = \|\hat\mu - \hat{\hat\mu}\|^2 + \|Y - \hat\mu\|^2 = \|\hat\mu - \hat{\hat\mu}\|^2 + \|e\|^2$$
$$\|Y\|^2 = \|\hat{\hat\mu}\|^2 + \|\hat\mu - \hat{\hat\mu}\|^2 + \|e\|^2 \qquad \text{(double Pythagoras)}$$
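The double Pythagoras identity is easy to verify numerically. A minimal sketch, assuming a two-sample layout and simulated data (not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 7, 4
# Pi_Omega: two-sample mean space; Pi_omega: common-mean subspace.
X_Omega = np.zeros((N, 2)); X_Omega[:m, 0] = 1; X_Omega[m:, 1] = 1
X_omega = np.ones((N, 1))

Y = rng.normal(size=N)
proj = lambda v, X: X @ np.linalg.lstsq(X, v, rcond=None)[0]
mu_hat, mu_hathat = proj(Y, X_Omega), proj(Y, X_omega)

e = Y - mu_hat           # residual, orthogonal to Pi_Omega
d = mu_hat - mu_hathat   # part of mu_hat unexplained by Pi_omega

# The three components are mutually orthogonal ...
assert abs(mu_hathat @ d) < 1e-10 and abs(mu_hathat @ e) < 1e-10 and abs(d @ e) < 1e-10
# ... so the squared norms add up ("double Pythagoras").
assert np.isclose(Y @ Y, mu_hathat @ mu_hathat + d @ d + e @ e)
print("double Pythagoras verified")
```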

Slide 10: LSE's = MLE's (Maximum Likelihood Estimates)

Under the normal distribution model for the $Y_i$ the LSE's are also the maximum likelihood estimates (MLE's) of $\mu$ w.r.t. the respective model constraints $\mu \in \Pi_\Omega$ and $\mu \in \Pi_\omega$. This follows immediately from the likelihood function for the observed $Y = y = (y_1,\dots,y_N)$:
$$L(\mu, \sigma) = f_{\mu,\sigma}(y_1,\dots,y_N) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^N \exp\left(-\sum_{i=1}^N (y_i - \mu_i)^2 \Big/ (2\sigma^2)\right).$$
It is maximized over $\Pi_\Omega$ ($\Pi_\omega$) by minimizing $\sum_{i=1}^N (y_i - \mu_i)^2$ w.r.t. $\mu \in \Pi_\Omega$ ($\Pi_\omega$), taking $\hat\sigma^2 = \sum_{i=1}^N (y_i - \hat\mu_i)^2/N$ and $\hat{\hat\sigma}^2 = \sum_{i=1}^N (y_i - \hat{\hat\mu}_i)^2/N$, respectively.

Slide 11: General Linear Model Theorem (Proof: Slide 36 ff.)

Theorem: Under the general linear model assumption we have:
$$\|\hat\mu - \hat{\hat\mu}\|^2 = \sum_{i=1}^N (\hat\mu_i - \hat{\hat\mu}_i)^2 = \|Y - \hat{\hat\mu}\|^2 - \|Y - \hat\mu\|^2 = \sum_{i=1}^N (Y_i - \hat{\hat\mu}_i)^2 - \sum_{i=1}^N (Y_i - \hat\mu_i)^2 \sim \sigma^2 \chi^2_{r,\lambda}$$
is independent of
$$SSE = \|Y - \hat\mu\|^2 = \sum_{i=1}^N (Y_i - \hat\mu_i)^2 \sim \sigma^2 \chi^2_{N-s}$$
and thus
$$F = \frac{\|\hat\mu - \hat{\hat\mu}\|^2 / r}{\|Y - \hat\mu\|^2 / (N - s)} \sim F_{r,\,N-s,\,\lambda},$$
with noncentrality parameter $\lambda = \|\hat\mu(\mu) - \hat{\hat\mu}(\mu)\|^2/\sigma^2 = \|\mu - \hat{\hat\mu}(\mu)\|^2/\sigma^2$, where $\hat{\hat\mu}(\mu)$ is viewed as the $\Pi_\omega$-LSE when $Y = \mu$ and $\hat\mu(\mu)$ as the $\Pi_\Omega$-LSE when $Y = \mu$. Of course, $\hat\mu(\mu) = \mu$ since $\mu \in \Pi_\Omega$.

Slide 12: General Linear Model Theorem (2-Sample Case)

Here the $\Pi_\Omega$-LSE of $\mu$ minimizes
$$\sum_{i=1}^N (Y_i - \mu_i)^2 = \sum_{i=1}^m (Y_i - \xi)^2 + \sum_{i=m+1}^{m+n} (Y_i - \eta)^2$$
$$\Longrightarrow\quad \hat\mu = (\hat\xi,\dots,\hat\xi,\hat\eta,\dots,\hat\eta) = \hat\xi a_1 + \hat\eta a_2 \quad\text{with } \hat\xi = \bar Y_1 = \sum_{i=1}^m Y_i/m \text{ and } \hat\eta = \bar Y_2 = \sum_{i=m+1}^{m+n} Y_i/n.$$
The $\Pi_\omega$-LSE of $\mu$ minimizes $\sum_{i=1}^N (Y_i - \mu)^2 \Longrightarrow \hat{\hat\mu} = (\hat\mu,\dots,\hat\mu) = \hat\mu\,1$ with $\hat\mu = \bar Y = (m\bar Y_1 + n\bar Y_2)/N$.
$$\|\hat\mu - \hat{\hat\mu}\|^2 = \frac{mn}{N}(\bar Y_1 - \bar Y_2)^2$$
$$F = \frac{\|\hat\mu - \hat{\hat\mu}\|^2 / r}{\|Y - \hat\mu\|^2 / (N - s)} = \frac{\left([\bar Y_1 - \bar Y_2]\big/\sqrt{1/m + 1/n}\right)^2 \big/ 1}{\sum_{i=1}^{m+n}(Y_i - \hat\mu_i)^2/(m + n - 2)} \sim F_{1,\,N-2,\,\lambda} = (\text{2-sample } t\text{-statistic})^2.$$
$F$-test $\Longleftrightarrow$ 2-sided $t$-test.
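A quick numerical check of $F = t^2$ in the two-sample case, using made-up data and scipy's equal-variance t-test (not part of the slides):

```python
import numpy as np
from scipy import stats

y1 = np.array([4.1, 6.0, 5.2, 4.7])   # sample 1 (m = 4), hypothetical
y2 = np.array([1.9, 3.4, 2.8])        # sample 2 (n = 3), hypothetical
m, n = len(y1), len(y2)
N = m + n

# Pooled variance estimate (denominator of F, df = N - 2)
sse = ((y1 - y1.mean())**2).sum() + ((y2 - y2.mean())**2).sum()
s2 = sse / (N - 2)

# F statistic from the slide: ||mu_hat - mu_hathat||^2 / s2, with mn/N = 1/(1/m + 1/n)
F = (m * n / N) * (y1.mean() - y2.mean())**2 / s2

# Equals the square of the equal-variance two-sample t statistic
t, _ = stats.ttest_ind(y1, y2, equal_var=True)
print(F, t**2)
```

The two printed values agree to floating-point precision.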

Slide 13: Noncentrality Parameter in the 2-Sample Case

To find the noncentrality parameter $\lambda$ we replace $Y$ by $\mu = (\xi,\dots,\xi,\eta,\dots,\eta)$ in
$$\|\hat\mu(Y) - \hat{\hat\mu}(Y)\|^2 = \frac{mn}{N}(\bar Y_1 - \bar Y_2)^2 \;\Longrightarrow\; \|\hat\mu(\mu) - \hat{\hat\mu}(\mu)\|^2 = \frac{mn}{N}(\xi - \eta)^2 = \left(\frac{\xi - \eta}{\sqrt{1/m + 1/n}}\right)^2.$$
Thus the noncentrality parameter (ncp) is
$$\lambda = \left(\frac{\xi - \eta}{\sqrt{1/m + 1/n}}\right)^2 \Big/ \sigma^2 = \delta^2, \quad\text{where}\quad \delta = \frac{\xi - \eta}{\sigma\sqrt{1/m + 1/n}}$$
is the ncp in the two-sample $t$-test.

Slide 14: General Linear Model Theorem (k-Sample Case)

Here the $\Pi_\Omega$-LSE of $\mu$ minimizes
$$\sum_{i=1}^N (Y_i - \mu_i)^2 = \sum_{i=1}^k \sum_{j=1}^{n_i} (Y_{ij} - \xi_i)^2 \;\Longrightarrow\; \hat\mu = \hat\xi_1 a_1 + \dots + \hat\xi_k a_k \quad\text{with } \hat\xi_i = \bar Y_{i.} = \sum_{j=1}^{n_i} Y_{ij}/n_i.$$
The $\Pi_\omega$-LSE of $\mu$ minimizes $\sum_{i=1}^k \sum_{j=1}^{n_i} (Y_{ij} - \mu)^2 \Longrightarrow \hat{\hat\mu} = (\hat\mu,\dots,\hat\mu) = \hat\mu\,1$ with $\hat\mu = \bar Y_{..} = \sum_{i=1}^k n_i \bar Y_{i.}/N$.
$$\|\hat\mu - \hat{\hat\mu}\|^2 = \sum_{i=1}^k n_i (\bar Y_{i.} - \bar Y_{..})^2$$
$$F = \frac{\|\hat\mu - \hat{\hat\mu}\|^2 / r}{\|Y - \hat\mu\|^2 / (N - s)} = \frac{\sum_{i=1}^k n_i (\bar Y_{i.} - \bar Y_{..})^2 / (k - 1)}{\sum_{i=1}^k \sum_{j=1}^{n_i} (Y_{ij} - \bar Y_{i.})^2 / (N - k)} \sim F_{k-1,\,N-k,\,\lambda}.$$
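The k-sample F statistic can be checked against scipy's one-way ANOVA; the three small samples below are hypothetical:

```python
import numpy as np
from scipy import stats

groups = [np.array([5.1, 4.8, 5.5]),
          np.array([3.9, 4.2, 4.0, 4.4]),
          np.array([6.0, 5.7])]
k = len(groups)
N = sum(len(g) for g in groups)
grand = np.concatenate(groups).mean()    # Y-bar..

# Numerator and denominator sums of squares from the slide
ss_between = sum(len(g) * (g.mean() - grand)**2 for g in groups)
ss_within  = sum(((g - g.mean())**2).sum() for g in groups)
F = (ss_between / (k - 1)) / (ss_within / (N - k))

# scipy's one-way ANOVA computes the same statistic
F_scipy, p = stats.f_oneway(*groups)
print(F, F_scipy)
```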

Slide 15: Noncentrality Parameter in the k-Sample Case

To get the noncentrality parameter $\lambda$ we replace $Y$ by $\mu = (\xi_1,\dots,\xi_1,\dots,\xi_k,\dots,\xi_k)$ in
$$\|\hat\mu(Y) - \hat{\hat\mu}(Y)\|^2 = \sum_{i=1}^k n_i (\bar Y_{i.} - \bar Y_{..})^2 \;\Longrightarrow\; \|\hat\mu(\mu) - \hat{\hat\mu}(\mu)\|^2 = \sum_{i=1}^k n_i (\xi_i - \bar\xi)^2 \quad\text{with } \bar\xi = \sum_{i=1}^k n_i \xi_i / N.$$
The noncentrality parameter is $\lambda = \sum_{i=1}^k n_i (\xi_i - \bar\xi)^2 / \sigma^2$.
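The ncp feeds directly into power calculations via the noncentral F distribution. A sketch with hypothetical group means $\xi_i$ and equal $n_i = 8$, using scipy's noncentral F (`scipy.stats.ncf`):

```python
import numpy as np
from scipy import stats

sigma = 1.0
xi = np.array([0.0, 0.5, 1.0])    # hypothetical group means
n_i = np.array([8, 8, 8])
k, N = len(xi), n_i.sum()

xi_bar = (n_i * xi).sum() / N
lam = (n_i * (xi - xi_bar)**2).sum() / sigma**2   # noncentrality parameter

# Power of the level-0.05 F test: P(F_{k-1, N-k, lam} > critical value)
crit = stats.f.ppf(0.95, k - 1, N - k)
power = stats.ncf.sf(crit, k - 1, N - k, lam)
print(round(lam, 3), round(power, 3))
```

Here $\lambda = 8(0.25 + 0 + 0.25) = 4$; larger $\lambda$ (bigger spread of the $\xi_i$, or larger $n_i$) yields higher power.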

Slide 16: The Two-Factor Experiment with Unequal Cell Counts

We consider the following model with two factors A and B:
$$Y_{ijk} = \mu_{ij} + e_{ijk} = \mu + a_i + b_j + c_{ij} + e_{ijk}$$
with $e_{ijk}$ i.i.d. $N(0, \sigma^2)$, $i = 1,\dots,t_1$, $j = 1,\dots,t_2$, $k = 1,\dots,n_{ij}$.

The replications $n_{ij}$ may vary from cell to cell. Even if we plan $n_{ij} = n$ for all $i, j$, accidents happen: lost data. Sometimes imbalance is planned (economic constraints). We may even have $n_{ij} = 0$ for some cells, which constrains the analysis. We require $n_{i.} = \sum_j n_{ij} > 0$ for all $i$, and similarly $n_{.j} > 0$ for all $j$.

Slide 17: Parametrization Equivalence: $n_{ij} > 0$ for all $i, j$

To establish a 1-1 correspondence between the $t_1 t_2$ freely varying cell means $\mu_{ij}$ and the model parameters $\mu, a_i, b_j, c_{ij}$, we impose the set-to-zero constraints on the latter, i.e., $a_1 = 0$, $b_1 = 0$ and $c_{1j} = c_{i1} = 0$ for all $i, j$. This leaves
$$1 + (t_1 - 1) + (t_2 - 1) + (t_1 - 1)(t_2 - 1) = t_1 t_2$$
freely varying model parameters. Clearly the $t_1 t_2$ freely varying model parameters $\mu, a_i, b_j, c_{ij}$ with set-to-zero side conditions give us $\mu_{ij} = \mu + a_i + b_j + c_{ij}$. Conversely, we get the model parameters with set-to-zero side conditions from the $\mu_{ij}$ via
$$\mu = \mu_{11}, \quad a_i = \mu_{i1} - \mu_{11}, \quad b_j = \mu_{1j} - \mu_{11},$$
$$c_{ij} = \mu_{ij} - \mu_{11} - (\mu_{i1} - \mu_{11}) - (\mu_{1j} - \mu_{11}) = \mu_{ij} - \mu_{i1} - \mu_{1j} + \mu_{11},$$
again with
$$\mu_{ij} = \mu + a_i + b_j + c_{ij} = \mu_{11} + (\mu_{i1} - \mu_{11}) + (\mu_{1j} - \mu_{11}) + (\mu_{ij} - \mu_{i1} - \mu_{1j} + \mu_{11}).$$
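The 1-1 correspondence can be verified mechanically. A numpy sketch with random cell means, using exactly the conversion formulas above:

```python
import numpy as np

rng = np.random.default_rng(1)
t1, t2 = 3, 4
mu_cell = rng.normal(size=(t1, t2))    # freely varying cell means mu_ij

# Cell means -> set-to-zero parameters
mu = mu_cell[0, 0]
a = mu_cell[:, 0] - mu_cell[0, 0]      # a_1 = 0 automatically
b = mu_cell[0, :] - mu_cell[0, 0]      # b_1 = 0 automatically
c = mu_cell - mu_cell[:, :1] - mu_cell[:1, :] + mu_cell[0, 0]  # c_1j = c_i1 = 0

# Parameters -> cell means: the round trip is exact
rebuilt = mu + a[:, None] + b[None, :] + c
assert np.allclose(rebuilt, mu_cell)
assert np.allclose(c[0, :], 0) and np.allclose(c[:, 0], 0)
print("1-1 parametrization verified")
```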

Slide 18: Problems Caused by Unequal Replications

$n_{ij} = n$ for all $i, j$ $\Longrightarrow$ orthogonality $\Longrightarrow$ nice ANOVA decomposition. When the $n_{ij}$ vary, that decomposition usually does not hold, i.e., we still have
$$Y_{ijk} = \bar Y_{...} + (\bar Y_{i..} - \bar Y_{...}) + (\bar Y_{.j.} - \bar Y_{...}) + (\bar Y_{ij.} - \bar Y_{i..} - \bar Y_{.j.} + \bar Y_{...}) + (Y_{ijk} - \bar Y_{ij.})$$
but typically $SS_T \ne SS_A + SS_B + SS_{AB} + SS_E$: the orthogonality no longer holds! However, we can still test various hypotheses:

$H_0\colon c_{ij} = 0$ for all $i, j$ (no interactions), i.e., the additive model holds.
$H_0^A\colon a_i = 0$ for all $i$, given no interactions (no main effect from A).
$H_0^B\colon b_j = 0$ for all $j$, given no interactions (no main effect from B).

The denominator error sum of squares and df will be different when testing $H_0$ or $H_0^A$ ($H_0^B$).

Slide 19: The Order of Factors

When both effects are tested in the additive model, mind the order in which the factors are specified in the model. In lm(Y ~ A + B), F_A tests $H_0^A$ within the additive model; F_B then tests factor B after accounting for A. The reason for the distinction is that the subspaces corresponding to A and B are not orthogonal: part of B may be explainable by A. Only the part of B orthogonal to the subspace corresponding to A comes into play when B is specified as the second factor. The sums of squares still add up; they are least squares distances of projections onto subspaces.
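The order dependence of these sequential sums of squares can be demonstrated with projections; the unbalanced layout below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Unbalanced two-factor layout (no interaction fitted): cell counts vary,
# so the A and B subspaces are not orthogonal after removing the mean.
levels_A = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
levels_B = np.array([0, 1, 1, 0, 0, 0, 1, 1, 1])
y = rng.normal(size=len(levels_A))

def dummies(z):
    return (z[:, None] == np.unique(z)[None, :]).astype(float)

one = np.ones((len(y), 1))
XA, XB = dummies(levels_A), dummies(levels_B)

def rss(X):
    """Residual sum of squares after projecting y onto col(X)."""
    fit = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return ((y - fit)**2).sum()

# Sequential sums of squares for the two orderings (as anova(lm(...)) reports)
ss_A_first = rss(one) - rss(np.hstack([one, XA]))
ss_B_after = rss(np.hstack([one, XA])) - rss(np.hstack([one, XA, XB]))
ss_B_first = rss(one) - rss(np.hstack([one, XB]))
ss_A_after = rss(np.hstack([one, XB])) - rss(np.hstack([one, XA, XB]))

# Each ordering decomposes the same total, but the per-factor pieces differ
assert np.isclose(ss_A_first + ss_B_after, ss_B_first + ss_A_after)
print(ss_A_first, ss_A_after)   # generally not equal in unbalanced data
```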

Slide 20: Illustration: With Interactions, $n_{ij} > 0$ for all $i, j$

# for factorial.test see class web page
> nmat <- cbind(c(2,1,2), c(2,3,2), c(2,3,1))
> out <- factorial.test(nmat=nmat, sig=1.5, int=T)
> out
[data frame listing with columns Y, a, b not preserved in extraction]

Slide 21: ANOVAs for Illustration Data

> anova(lm(Y ~ a + b + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: a *, b **, a:b *]

> anova(lm(Y ~ b + a + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: b **, a **, b:a *]

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Slide 22: Some Comments

Note the order of a+b and b+a in the previous models. The results for those two factors are different. This is due to the order of subspace projections. Sums of squares = squared distance within the respective effect subspace to the hypothesis that the effect is zero. The residual sum of squares measures the variation perpendicular to the involved effects.

Slide 23: Illustration: No Interactions, $n_{ij} > 0$ for all $i, j$

> nmat <- cbind(c(2,1,2), c(2,3,2), c(2,3,1))
> out <- factorial.test(nmat=nmat, sig=.5, int=F)
> out
[data frame listing with columns Y, a, b not preserved in extraction]

Slide 24: ANOVAs for Illustration Data

> anova(lm(Y ~ a + b + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: a *, b *, a:b not significant]

> anova(lm(Y ~ b + a + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: b *, a **, b:a not significant]

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Slide 25: Illustration: With Interactions, Some $n_{ij} = 0$

> nmat <- cbind(c(0, 1, 2), c(2, 3, 2), c(2, 3, 0))
> out <- factorial.test(nmat=nmat, sig=.5, int=T)
> out
[data frame listing with columns Y, a, b not preserved in extraction]

Slide 26: ANOVAs for Illustration Data

> anova(lm(Y ~ a + b + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: a ***, b **, a:b not significant]

> anova(lm(Y ~ b + a + a*b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: b ***, a ***, b:a not significant]

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Slide 27: ANOVAs for Illustration Data

> anova(lm(Y ~ a + b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: a ***, b **]

> anova(lm(Y ~ a:b, data=out))
Analysis of Variance Table
Response: Y
[numeric columns not preserved in extraction; significance: a:b ***]

Note: the residual sum of squares of the a:b (cell means) fit equals $\sum_{i,j} \sum_{k:\, n_{ij} > 0} (Y_{ijk} - \bar Y_{ij.})^2$, the residual sum of squares from the previous slide.

Slide 28: Comments on Previous ANOVAs

b:a only has 2 degrees of freedom, in place of the previous 4: in the full model we are missing 2 of the $\mu_{ij}$'s. This works as long as the pattern of non-missing cells is connected, i.e., any $\mu_{ij}$ with $n_{ij} > 0$ can be connected by a rectangular path to any other $\mu_{i'j'}$ with $n_{i'j'} > 0$. The rectangular path can only traverse cells with $n_{ij} > 0$. See Chapter 5 of Linear Models for Unbalanced Data by Shayle R. Searle, Wiley 1987.
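Connectedness of the non-missing cell pattern can be checked mechanically. A small sketch (the function name is ours, not from the course code): two occupied cells are linked when they share a row or a column, and the pattern is connected when every occupied cell is reachable through such links.

```python
import numpy as np
from collections import deque

def is_connected(nmat):
    """Check whether the cells with n_ij > 0 are connected by rectangular
    (same-row / same-column) moves through occupied cells."""
    cells = {(i, j) for i, j in zip(*np.nonzero(nmat))}
    if not cells:
        return True
    start = next(iter(cells))
    seen, queue = {start}, deque([start])
    while queue:
        i, j = queue.popleft()
        for cell in cells - seen:
            if cell[0] == i or cell[1] == j:   # share a row or a column
                seen.add(cell)
                queue.append(cell)
    return seen == cells

# Incidence pattern of slide 25 (rows of cbind(c(0,1,2), c(2,3,2), c(2,3,0)))
nmat = np.array([[0, 2, 2], [1, 3, 3], [2, 3, 0]])
print(is_connected(nmat))                      # connected despite two empty cells
disconnected = np.array([[1, 0], [0, 1]])
print(is_connected(disconnected))              # diagonal pattern is not connected
```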

Slide 29: Analysis of Covariance (ANCOVA)

The simplest example model is as follows:
$$Y_{ij} = \mu + \tau_i + \beta(x_{ij} - \bar x_{..}) + \epsilon_{ij} \quad\text{with } \epsilon_{ij} \text{ i.i.d. } N(0, \sigma^2).$$
We have $a$ treatments, $i = 1,\dots,a$, and $n$ observations $Y_{ij}$ at level $i$, with covariates $x_{ij}$. The $\tau_i$ satisfy sum-to-zero or set-to-zero side conditions. If we ignore the $x_{ij}$, the response variability may be inflated, and the $x_{ij}$ may suggest (false) treatment effects. The above model is an ANOVA/regression hybrid.

There are many generalizations: more treatments; more covariates; different slopes, i.e., covariate/treatment interactions; treatment interactions. ANCOVA is useful for adjusting for variation due to uncontrollable nuisance variables. Contrast this with blocking.
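The ANCOVA model can be fit by ordinary least squares on a design matrix with treatment dummies and the centered covariate. A sketch on simulated data with hypothetical parameter values (set-to-zero coding, treatment 1 as baseline):

```python
import numpy as np

rng = np.random.default_rng(3)
a, n = 3, 5                           # a treatments, n observations each
tau = np.array([0.0, 1.0, -1.0])      # hypothetical treatment effects
beta, mu, sigma = 2.0, 10.0, 0.5      # hypothetical slope, mean, error sd

x = rng.uniform(0, 1, size=(a, n))    # covariate values x_ij
eps = rng.normal(0, sigma, size=(a, n))
y = mu + tau[:, None] + beta * (x - x.mean()) + eps

# Design matrix: intercept, set-to-zero treatment dummies, centered covariate
groups = np.repeat(np.arange(a), n)
X = np.column_stack([np.ones(a * n),
                     (groups == 1).astype(float),
                     (groups == 2).astype(float),
                     (x - x.mean()).ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
print(np.round(coef, 2))   # approx. [mu + tau_1, tau_2 - tau_1, tau_3 - tau_1, beta]
```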

Slide 30: Breaking Strength Example Data (Montgomery)

The following data were taken from Montgomery. They represent breaking strength data for monofilament fiber produced by 3 different machines $M_i$, $i = 1, 2, 3$. The strength $Y_{ij}$ is measured in pounds. $Y_{ij}$ is affected by the fiber diameter $x_{ij}$. The pairs $(x_{ij}, Y_{ij})$, $j = 1,\dots,5$, were taken from machine $M_i$.

Slide 31: The Data

> breakstrength
[data frame with columns strength, diameter, machine; the row listing was not preserved in extraction]

Slide 32: A First Look

[Scatterplot of breaking strength y against diameter x, plotted separately for Machines 1, 2 and 3.]

Slide 33: ANOVA of Diameter vs Machine

> anova(lm(diameter ~ machine, data=breakstrength))
Analysis of Variance Table
Response: diameter
[numeric columns not preserved in extraction; machine is not significant]

When using ANCOVA it is useful to understand the relationship between the factor (machine) and the covariate (diameter). When adjusting for the covariate first, we may also adjust away part of the factor effect. Here machine seems not to affect the covariate.

Slide 34: ANCOVA

> anova(lm(strength ~ machine + diameter, data=breakstrength))
Analysis of Variance Table
Response: strength
[numeric columns not preserved in extraction; machine *** (p on the order of 1e-05), diameter *** (p on the order of 1e-06)]

> anova(lm(strength ~ diameter + machine, data=breakstrength))
Analysis of Variance Table
Response: strength
[numeric columns not preserved in extraction; diameter *** (p on the order of 1e-07), machine no longer significant]

Note the difference in significance!

Slide 35: Diagnostic Plots

[Diagnostic plots: residuals against fitted values, residuals against diameter, residuals by machine, and a normal Q-Q plot of the residuals.]

Slide 36: Appendix A: General Linear Model Theorem Proof

The proof is essentially that in Testing Statistical Hypotheses, 3rd edition, Ch. 7, by E.L. Lehmann and J.P. Romano (2005), where it is presented in the context of certain optimality properties and followed by many explicit examples. The proof is first given for a special case of linear subspaces $\Pi_\Omega$ and $\Pi_\omega \subset \Pi_\Omega$, where the statement of the theorem is immediate from the definitions of $\chi^2_f$, $\chi^2_{g,\lambda}$ and $F_{f,g,\lambda}$. The general case can always be orthonormally transformed to the special case. This does not change the meaning of LSE: LSE's in one framework rotate to LSE's in the other, and orthonormal transforms don't change distances or independence.

Slide 37: General Linear Model Theorem: Special Case

$U_i \sim N(\nu_i, \sigma^2)$, $i = 1,\dots,N$, with model hypothesis $\nu_{s+1} = \dots = \nu_N = 0$. This describes an $s$-dimensional linear subspace $\Pi_\Omega$ of $\mathbb{R}^N$. Test $H_0\colon \nu_1 = \dots = \nu_r = 0$ (inside the model hypothesis); this describes an $(s-r)$-dimensional subspace $\Pi_\omega \subset \Pi_\Omega$. Then
$$\sum_{i=s+1}^N U_i^2/\sigma^2 \sim \chi^2_{N-s} \quad\text{and}\quad \sum_{i=1}^r U_i^2/\sigma^2 \sim \chi^2_{r,\lambda}$$
are independent, with ncp $\lambda = \sum_{i=1}^r \nu_i^2/\sigma^2$. The natural test rejects $H_0$ when the corresponding $F$-statistic is too large, where
$$F = \frac{\sum_{i=1}^r U_i^2 / r}{\sum_{i=s+1}^N U_i^2 / (N - s)} \sim F_{r,\,N-s,\,\lambda}.$$

Slide 38: LSE View in the Special Case

The $\Pi_\Omega$-LSE $\hat\nu$ minimizes
$$\|U - \nu\|^2 = \sum_{i=1}^N (U_i - \nu_i)^2 = \sum_{i=1}^s (U_i - \nu_i)^2 + \sum_{i=s+1}^N U_i^2 \quad\text{over } \nu \in \Pi_\Omega,$$
i.e., $\hat\nu_i = U_i$ for $i = 1,\dots,s$ and $\hat\nu_i = 0$ for $i = s+1,\dots,N$. Similarly, the $\Pi_\omega$-LSE $\hat{\hat\nu}$ minimizes
$$\|U - \nu\|^2 = \sum_{i=1}^r U_i^2 + \sum_{i=r+1}^s (U_i - \nu_i)^2 + \sum_{i=s+1}^N U_i^2 \quad\text{over } \nu \in \Pi_\omega,$$
i.e., $\hat{\hat\nu}_i = U_i$ for $i = r+1,\dots,s$ and $\hat{\hat\nu}_i = 0$ for $i = 1,\dots,r,\,s+1,\dots,N$. Clearly
$$\|U - \hat\nu\|^2 = \sum_{i=s+1}^N U_i^2 \quad\text{and}\quad \|\hat\nu - \hat{\hat\nu}\|^2 = \|U - \hat{\hat\nu}\|^2 - \|U - \hat\nu\|^2 = \sum_{i=1}^r U_i^2,$$
$$F = \frac{\|\hat\nu - \hat{\hat\nu}\|^2 / r}{\|U - \hat\nu\|^2 / (N - s)} = \frac{\sum_{i=1}^r U_i^2 / r}{\sum_{i=s+1}^N U_i^2 / (N - s)}.$$

Slide 39: LSE View of the Noncentrality Parameter

Using $U = \nu \in \Pi_\Omega$ in the $\Pi_\omega$-LSE derivation of $\nu_\omega$ we have
$$\|U - \nu_\omega\|^2 = \|\nu - \nu_\omega\|^2 = \sum_{i=1}^N (\nu_i - \nu_{\omega,i})^2 = \sum_{i=1}^r \nu_i^2 + \sum_{i=r+1}^s (\nu_i - \nu_{\omega,i})^2 + \sum_{i=s+1}^N \nu_i^2 \quad\text{since } \nu_\omega \in \Pi_\omega \subset \Pi_\Omega$$
$$= \sum_{i=1}^r \nu_i^2 + \sum_{i=r+1}^s (\nu_i - \nu_{\omega,i})^2 \quad\text{since } \nu \in \Pi_\Omega$$
$$= \sum_{i=1}^r \nu_i^2 \quad\text{after minimizing over } \Pi_\omega, \text{ i.e., } \nu_{\omega,i} = \nu_i,\ i = r+1,\dots,s$$
$$\Longrightarrow\quad \lambda = \|\nu - \nu_\omega\|^2/\sigma^2 = \sum_{i=1}^r \nu_i^2/\sigma^2.$$

Slide 40: Orthonormal Transform to the Special Case

The general case can always be orthonormally transformed to the special case. Let $C$ be an orthonormal matrix with rows $c_i'$, where $c_i$, $i = 1,\dots,s$, span $\Pi_\Omega$ and $c_i$, $i = r+1,\dots,s$, span $\Pi_\omega$. Such orthonormal basis vectors can always be constructed via the Gram-Schmidt process. Transform $U = CY$ with mean vector $\nu = C\mu$; $U_1,\dots,U_N$ are again independent with common variance $\sigma^2$. Now note that
$$\mu \in \Pi_\Omega \iff \mu \perp c_i, \text{ i.e., } \nu_i = c_i'\mu = 0, \text{ for } i = s+1,\dots,N \iff \nu \in \Pi_\Omega \text{ (special-case form)}$$
$$\mu \in \Pi_\omega \iff \mu \perp c_i, \text{ i.e., } \nu_i = c_i'\mu = 0, \text{ for } i = 1,\dots,r,\,s+1,\dots,N \iff \nu \in \Pi_\omega \text{ (special-case form)}$$
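The reduction to the special case can be illustrated numerically: completing an orthonormal basis of $\Pi_\Omega$ to a full orthonormal matrix $C$ (numpy's QR plays the role of Gram-Schmidt here) turns the LS distance to $\Pi_\Omega$ into the norm of the trailing coordinates of $U = CY$. The subspace and data below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
N, s = 6, 3

# An s-dimensional subspace Pi_Omega of R^N with orthonormal basis (via QR)
X, _ = np.linalg.qr(rng.normal(size=(N, s)))
Y = rng.normal(size=N)

proj = lambda v, B: B @ (B.T @ v)   # projection onto col(B), B orthonormal

# Complete X to a full orthonormal matrix; its transpose has rows c_i',
# the first s of which span Pi_Omega.
C = np.linalg.qr(np.column_stack([X, rng.normal(size=(N, N - s))]))[0].T
U = C @ Y                            # transformed data, nu = C mu

# mu in Pi_Omega <=> nu_{s+1} = ... = nu_N = 0, and LS distances are unchanged:
# the distance from Y to Pi_Omega equals the norm of the trailing coordinates of U.
print(np.linalg.norm(Y - proj(Y, X)), np.linalg.norm(U[s:]))
```

Both printed distances agree, which is exactly the invariance the next two slides use.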

Slide 41: Orthonormal Transform and Independence

Suppose $V_1,\dots,V_N$ are i.i.d. $N(0,1)$; then $Z = CV$ has components $Z_1,\dots,Z_N$ i.i.d. $N(0,1)$:
$$f(v) = \left(\frac{1}{\sqrt{2\pi}}\right)^N \exp\left(-\sum_{i=1}^N v_i^2/2\right) = \left(\frac{1}{\sqrt{2\pi}}\right)^N \exp\left(-\|v\|^2/2\right)$$
has constant density if and only if $\|v\|$ is constant, i.e., the constant-level density contour surfaces consist of points that are equidistant from the origin. Orthonormal transformations $z = Cv$ preserve distances ($\|v\| = \|z\|$), so the transformed vector $Z$ has the same density $f(z)$.

The general independence result may be seen as follows: $Y = \mu + \sigma V$ has independent components $Y_1,\dots,Y_N$, and $U = CY = C\mu + \sigma CV = \nu + \sigma Z$ has independent components.

Slide 42: Orthonormal Transform and LSE's

The principle of LSE's is based on minimizing distances, and orthonormal transforms don't change distances: with $b = Ca$,
$$\|b\|^2 = b'b = a'C'Ca = a'a = \|a\|^2.$$
Thus the LSE's (w.r.t. $\Pi_\Omega$ or $\Pi_\omega$) based on the untransformed $Y$ simply transform to the LSE's w.r.t. the transformed $\Pi_\Omega$ or $\Pi_\omega$, based on the transformed $U$. Numerator and denominator of $F$ involve LS distances, which are unchanged as we pass from $Y$ to $U = CY$. The same holds for the ncp, also characterized in terms of an LS distance$/\sigma^2$, i.e.,
$$\text{ncp} = \|\hat{\hat\mu}(\mu) - \mu\|^2/\sigma^2 = \|\hat{\hat\nu}(\nu) - \nu\|^2/\sigma^2,$$
based on $Y = \mu$ or $U = \nu$, respectively.


More information

Lec 5: Factorial Experiment

Lec 5: Factorial Experiment November 21, 2011 Example Study of the battery life vs the factors temperatures and types of material. A: Types of material, 3 levels. B: Temperatures, 3 levels. Example Study of the battery life vs the

More information

Lecture 1: Linear Models and Applications

Lecture 1: Linear Models and Applications Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation

More information

Two or more categorical predictors. 2.1 Two fixed effects

Two or more categorical predictors. 2.1 Two fixed effects Two or more categorical predictors Here we extend the ANOVA methods to handle multiple categorical predictors. The statistician has to watch carefully to see whether the effects being considered are properly

More information

STA 114: Statistics. Notes 21. Linear Regression

STA 114: Statistics. Notes 21. Linear Regression STA 114: Statistics Notes 1. Linear Regression Introduction A most widely used statistical analysis is regression, where one tries to explain a response variable Y by an explanatory variable X based on

More information

36-707: Regression Analysis Homework Solutions. Homework 3

36-707: Regression Analysis Homework Solutions. Homework 3 36-707: Regression Analysis Homework Solutions Homework 3 Fall 2012 Problem 1 Y i = βx i + ɛ i, i {1, 2,..., n}. (a) Find the LS estimator of β: RSS = Σ n i=1(y i βx i ) 2 RSS β = Σ n i=1( 2X i )(Y i βx

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Analysis of Covariance

Analysis of Covariance Analysis of Covariance (ANCOVA) Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 10 1 When to Use ANCOVA In experiment, there is a nuisance factor x that is 1 Correlated with y 2

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

Unit 6: Orthogonal Designs Theory, Randomized Complete Block Designs, and Latin Squares

Unit 6: Orthogonal Designs Theory, Randomized Complete Block Designs, and Latin Squares Unit 6: Orthogonal Designs Theory, Randomized Complete Block Designs, and Latin Squares STA 643: Advanced Experimental Design Derek S. Young 1 Learning Objectives Understand the basics of orthogonal designs

More information

Chapter 4: Randomized Blocks and Latin Squares

Chapter 4: Randomized Blocks and Latin Squares Chapter 4: Randomized Blocks and Latin Squares 1 Design of Engineering Experiments The Blocking Principle Blocking and nuisance factors The randomized complete block design or the RCBD Extension of the

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Course topics (tentative) The role of random effects

Course topics (tentative) The role of random effects Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen

More information

Math 261 Lecture Notes: Sections 6.1, 6.2, 6.3 and 6.4 Orthogonal Sets and Projections

Math 261 Lecture Notes: Sections 6.1, 6.2, 6.3 and 6.4 Orthogonal Sets and Projections Math 6 Lecture Notes: Sections 6., 6., 6. and 6. Orthogonal Sets and Projections We will not cover general inner product spaces. We will, however, focus on a particular inner product space the inner product

More information

Chapter 12. Analysis of variance

Chapter 12. Analysis of variance Serik Sagitov, Chalmers and GU, January 9, 016 Chapter 1. Analysis of variance Chapter 11: I = samples independent samples paired samples Chapter 1: I 3 samples of equal size J one-way layout two-way layout

More information

STATISTICS 368 AN EXPERIMENT IN AIRCRAFT PRODUCTION Christopher Wiens & Douglas Wiens

STATISTICS 368 AN EXPERIMENT IN AIRCRAFT PRODUCTION Christopher Wiens & Douglas Wiens STATISTICS 368 AN EXPERIMENT IN AIRCRAFT PRODUCTION Christopher Wiens & Douglas Wiens April 21, 2005 The progress of science. 1. Preliminary description of experiment We set out to determine the factors

More information

3. Design Experiments and Variance Analysis

3. Design Experiments and Variance Analysis 3. Design Experiments and Variance Analysis Isabel M. Rodrigues 1 / 46 3.1. Completely randomized experiment. Experimentation allows an investigator to find out what happens to the output variables when

More information

Example 1 describes the results from analyzing these data for three groups and two variables contained in test file manova1.tf3.

Example 1 describes the results from analyzing these data for three groups and two variables contained in test file manova1.tf3. Simfit Tutorials and worked examples for simulation, curve fitting, statistical analysis, and plotting. http://www.simfit.org.uk MANOVA examples From the main SimFIT menu choose [Statistcs], [Multivariate],

More information

BIOS 2083 Linear Models c Abdus S. Wahed

BIOS 2083 Linear Models c Abdus S. Wahed Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter

More information

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication CHAPTER 4 Analysis of Variance One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication 1 Introduction In this chapter, expand the idea of hypothesis tests. We

More information

General Linear Model: Statistical Inference

General Linear Model: Statistical Inference Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least

More information

Two-Way Analysis of Variance - no interaction

Two-Way Analysis of Variance - no interaction 1 Two-Way Analysis of Variance - no interaction Example: Tests were conducted to assess the effects of two factors, engine type, and propellant type, on propellant burn rate in fired missiles. Three engine

More information

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,

More information

1 Use of indicator random variables. (Chapter 8)

1 Use of indicator random variables. (Chapter 8) 1 Use of indicator random variables. (Chapter 8) let I(A) = 1 if the event A occurs, and I(A) = 0 otherwise. I(A) is referred to as the indicator of the event A. The notation I A is often used. 1 2 Fitting

More information

Stat 217 Final Exam. Name: May 1, 2002

Stat 217 Final Exam. Name: May 1, 2002 Stat 217 Final Exam Name: May 1, 2002 Problem 1. Three brands of batteries are under study. It is suspected that the lives (in weeks) of the three brands are different. Five batteries of each brand are

More information

1 One-way Analysis of Variance

1 One-way Analysis of Variance 1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment

More information

Multi-Factor Experiments

Multi-Factor Experiments ETH p. 1/31 Multi-Factor Experiments Twoway anova Interactions More than two factors ETH p. 2/31 Hypertension: Effect of biofeedback Biofeedback Biofeedback Medication Control + Medication 158 188 186

More information

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow) STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Written Exam (2 hours)

Written Exam (2 hours) M. Müller Applied Analysis of Variance and Experimental Design Summer 2015 Written Exam (2 hours) General remarks: Open book exam. Switch off your mobile phone! Do not stay too long on a part where you

More information

STAT 705 Chapter 16: One-way ANOVA

STAT 705 Chapter 16: One-way ANOVA STAT 705 Chapter 16: One-way ANOVA Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 21 What is ANOVA? Analysis of variance (ANOVA) models are regression

More information

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 Time allowed: 3 HOURS. STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 This is an open book exam: all course notes and the text are allowed, and you are expected to use your own calculator.

More information

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018 Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from

More information

COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure)

COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure) STAT 52 Completely Randomized Designs COMPLETELY RANDOMIZED DESIGNS (CRD) For now, t unstructured treatments (e.g. no factorial structure) Completely randomized means no restrictions on the randomization

More information

Statistics - Lecture Three. Linear Models. Charlotte Wickham 1.

Statistics - Lecture Three. Linear Models. Charlotte Wickham   1. Statistics - Lecture Three Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Linear Models 1. The Theory 2. Practical Use 3. How to do it in R 4. An example 5. Extensions

More information

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA.

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA. Analysis of Variance Read Chapter 14 and Sections 15.1-15.2 to review one-way ANOVA. Design of an experiment the process of planning an experiment to insure that an appropriate analysis is possible. Some

More information

Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two

More information

Orthogonal contrasts for a 2x2 factorial design Example p130

Orthogonal contrasts for a 2x2 factorial design Example p130 Week 9: Orthogonal comparisons for a 2x2 factorial design. The general two-factor factorial arrangement. Interaction and additivity. ANOVA summary table, tests, CIs. Planned/post-hoc comparisons for the

More information

Some General Types of Tests

Some General Types of Tests Some General Types of Tests We may not be able to find a UMP or UMPU test in a given situation. In that case, we may use test of some general class of tests that often have good asymptotic properties.

More information

Orthogonal Complements

Orthogonal Complements Orthogonal Complements Definition Let W be a subspace of R n. If a vector z is orthogonal to every vector in W, then z is said to be orthogonal to W. The set of all such vectors z is called the orthogonal

More information

Multiple Regression: Example

Multiple Regression: Example Multiple Regression: Example Cobb-Douglas Production Function The Cobb-Douglas production function for observed economic data i = 1,..., n may be expressed as where O i is output l i is labour input c

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

MODELS WITHOUT AN INTERCEPT

MODELS WITHOUT AN INTERCEPT Consider the balanced two factor design MODELS WITHOUT AN INTERCEPT Factor A 3 levels, indexed j 0, 1, 2; Factor B 5 levels, indexed l 0, 1, 2, 3, 4; n jl 4 replicate observations for each factor level

More information

Stat 5102 Final Exam May 14, 2015

Stat 5102 Final Exam May 14, 2015 Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths Cuckoo Birds Analysis of Variance Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 29th November 2005 Cuckoo birds have a behavior in which they lay their

More information

Analysis of variance using orthogonal projections

Analysis of variance using orthogonal projections Analysis of variance using orthogonal projections Rasmus Waagepetersen Abstract The purpose of this note is to show how statistical theory for inference in balanced ANOVA models can be conveniently developed

More information

Example - Alfalfa (11.6.1) Lecture 14 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect

Example - Alfalfa (11.6.1) Lecture 14 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect (11.6.1) Lecture 14 - ANOVA cont. Sta102 / BME102 Colin Rundel March 19, 2014 Researchers were interested in the effect that acid has on the growth rate of alfalfa plants. They created three treatment

More information

PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design

PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design The purpose of this experiment was to determine differences in alkaloid concentration of tea leaves, based on herb variety (Factor A)

More information

11 Hypothesis Testing

11 Hypothesis Testing 28 11 Hypothesis Testing 111 Introduction Suppose we want to test the hypothesis: H : A q p β p 1 q 1 In terms of the rows of A this can be written as a 1 a q β, ie a i β for each row of A (here a i denotes

More information

UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

More information