Analysis of variance using orthogonal projections


Rasmus Waagepetersen

Abstract

The purpose of this note is to show how statistical theory for inference in balanced ANOVA models can be conveniently developed using orthogonal projections. The approach is exemplified for one- and two-way ANOVA, and finally some hints are given regarding the extension to three- or higher-way ANOVA.

1 Prerequisites on factors and projections

Analysis of variance (ANOVA) models are specified in terms of grouping variables, or factors. Suppose observations $y_i$ are indexed by a set $I$ of cardinality $n$. A factor $F$ is a function that assigns a grouping label, from a finite set of labels, to each observation. E.g. $F(i) = q$ means that observation $y_i$ (or index $i$) is assigned to group/level $q$ for the factor $F$. Suppose $F$ generates $k$ groups. The design matrix $Z_F$ corresponding to $F$ is then $n \times k$, and the $iq$th entry of $Z_F$ is 1 if $i$ is assigned to group $q$ and 0 otherwise. Note that in many applications $i$ is a multi-index of the form $i = i_1 i_2 \cdots i_p$ for $p \ge 1$.

For any factor $F$ we denote by $L_F$ the column space of $Z_F$. The orthogonal projection on $L_F$ is denoted $P_F$. The result of applying $P_F$ to a vector $(y_i)_{i \in I}$ is that $y_i$ is replaced by the average of the observations in the group that $i$ belongs to (i.e. the average of those $y_l$ for which $F(l) = F(i)$).

Two factors play a special role. The unit factor $I$ has a unique level for each observation, so $L_I = \mathbb{R}^n$ and $P_I = I$ (with an abuse of notation, $I$ is used both for the factor and the identity matrix). The factor $0$ assigns all observations to the same group, so $L_0 = \mathrm{span}\{1_n\}$ and $P_0 = 1_n 1_n^T / n$. Note that $P_0 y$ is simply the vector where each component is given by the average $\bar{y} = \sum_{i \in I} y_i / n$. For any factor $F$, $L_0 \subseteq L_F \subseteq L_I$.
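To make the definitions concrete, here is a minimal numerical sketch (in Python/NumPy, with a made-up factor of $k = 3$ groups; none of this is part of the note itself) of the design matrix $Z_F$ and the projections $P_F$ and $P_0$: applying $P_F$ returns group averages and applying $P_0$ returns the overall average.

```python
# Illustration only: a made-up factor with k = 3 groups and n = 6 observations.
import numpy as np

def design_matrix(labels):
    """n x k indicator matrix Z_F: entry (i, q) is 1 if observation i has level q."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels])

def projection(Z):
    """Orthogonal projection onto the column space of Z (full column rank assumed)."""
    return Z @ np.linalg.solve(Z.T @ Z, Z.T)

F = [1, 1, 2, 2, 3, 3]
y = np.array([4.0, 6.0, 1.0, 3.0, 7.0, 9.0])

Z_F = design_matrix(F)
P_F = projection(Z_F)
P_0 = np.ones((6, 6)) / 6          # projection onto span{1_n}

print(P_F @ y)                     # group averages: 5, 5, 2, 2, 8, 8
print(P_0 @ y)                     # overall average 5 in every entry
```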

A factor $F$ is said to be balanced if there is a common number $m$ of observations at each of the $k$ levels (whereby $n = mk$). In this case, the orthogonal projection $P_F$ on $L_F$ is
$$P_F = \frac{1}{m} Z_F Z_F^T.$$
This type of identity is crucial in the following and is the reason why we focus on balanced factors.

2 One-way ANOVA

Consider the model
$$Y_{ij} = \xi + U_i + \epsilon_{ij}, \quad i = 1, \dots, k, \ j = 1, \dots, n_i,$$
where $\xi \in \mathbb{R}$, $U_i \sim N(0, \tau^2)$, $\epsilon_{ij} \sim N(0, \sigma^2)$, $\tau^2, \sigma^2 \ge 0$, and the variables $U_i$ and $\epsilon_{ij}$ are independent, $i = 1, \dots, k$ and $j = 1, \dots, n_i$. Let $I$ consist of the indices $ij$ for $i = 1, \dots, k$ and $j = 1, \dots, n_i$, and define the factor $F$ by $F(ij) = i$. Then, stacking variables on top of each other, we can write the model in vector form as
$$Y = 1_n \xi + Z_F U + \epsilon$$
where $n$ is the total number of observations. As noted in the previous section, $Z_F$ is the design matrix corresponding to $F$: the $(ij, q)$th entry of $Z_F$ is 1 if $Y_{ij}$ belongs to the $q$th group and zero otherwise.

2.1 Orthogonal decomposition

We now obtain an orthogonal decomposition of $\mathbb{R}^n$:
$$\mathbb{R}^n = V_0 \oplus V_F \oplus V_I$$
where $V_0 = L_0 = \mathrm{span}(1_n)$, $V_F = L_F \ominus V_0$ and $V_I = \mathbb{R}^n \ominus L_F$. Here $\oplus$ denotes the sum of orthogonal subspaces: $L_1 \oplus L_2 = \{x + y \mid x \in L_1, y \in L_2\}$ for orthogonal subspaces $L_1$ and $L_2$, and $\ominus$ denotes the orthogonal complement: $L_2 \ominus L_1 = \{x \in L_2 \mid x^T y = 0 \ \forall y \in L_1\}$

for $L_1 \subseteq L_2$. The dimensions of $V_0$, $V_F$ and $V_I$ are 1, $k - 1$ and $n - k$, and the orthogonal projections on $V_0$, $V_F$ and $V_I$ are $Q_0 = P_0$, $Q_F = P_F - P_0$ and $Q_I = I - P_F$ (see exercise 1). Using the orthogonal projections we also obtain an orthogonal decomposition of the data vector
$$Y = Q_0 Y + Q_F Y + Q_I Y$$
into components falling in $V_0$, $V_F$ and $V_I$.

The covariance matrix is decomposed as
$$\mathrm{Cov}\, Y = \mathrm{Cov}\, Z_F U + \mathrm{Cov}\, \epsilon = m\tau^2 P_F + \sigma^2 I = \lambda P_F + \sigma^2 Q_I$$
where $\lambda = m\tau^2 + \sigma^2$. Here we used $\mathrm{Cov}\, Z_F U = \tau^2 Z_F Z_F^T = \tau^2 m P_F$ and $I = P_F + Q_I$. Note that there is a one-to-one correspondence between the pairs $(\lambda, \sigma^2)$ and $(\tau^2, \sigma^2)$.

The components $Q_0 Y$, $Q_F Y$ and $Q_I Y$ are independent since their covariances are zero. For example,
$$\mathrm{Cov}(Q_0 Y, Q_F Y) = Q_0 \Sigma Q_F = Q_0 (\lambda P_F + \sigma^2 Q_I) Q_F$$
which is zero since $P_F = Q_0 + Q_F$ and all the products $Q_0 Q_F = Q_0 Q_I = Q_F Q_I = 0$.

Since $\mathrm{Cov}\, Y = \lambda P_F + \sigma^2 Q_I = \mathrm{Cov}\, P_F Y + \mathrm{Cov}\, Q_I Y$ we also consider in the next section the coarser decomposition $Y = P_F Y + Q_I Y$ of $Y$ into independent components $P_F Y$ and $Q_I Y$ falling in $L_F$ and $V_I$.

2.2 Estimation using factorization of likelihood

Note $P_F 1_n = Q_0 1_n = 1_n$ and $Q_I 1_n = 0$. Moreover $Q_I P_F = 0$. Hence
$$\begin{bmatrix} P_F Y \\ Q_I Y \end{bmatrix} \sim N\left( \begin{bmatrix} 1_n \xi \\ 0_n \end{bmatrix}, \begin{bmatrix} \lambda P_F & 0 \\ 0 & \sigma^2 Q_I \end{bmatrix} \right).$$
We can thus base maximum likelihood estimation of $(\xi, \lambda)$ on $P_F Y$ and maximum likelihood estimation of $\sigma^2$ on $Q_I Y$.
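The identities above are easy to check numerically. The sketch below (a made-up balanced layout with $k = 3$, $m = 4$ and arbitrary values of $\tau^2$ and $\sigma^2$) verifies the covariance decomposition $\mathrm{Cov}\, Y = \lambda P_F + \sigma^2 Q_I$, the mutual orthogonality of the $Q$ projections and the dimensions of $V_F$ and $V_I$.

```python
# Illustration only: balanced one-way layout, k = 3 groups, m = 4 replicates.
import numpy as np

k, m = 3, 4
n = k * m
tau2, sigma2 = 2.0, 1.5
lam = m * tau2 + sigma2

Z_F = np.kron(np.eye(k), np.ones((m, 1)))     # design matrix of the group factor
P_F = Z_F @ Z_F.T / m                         # balanced: P_F = Z_F Z_F^T / m
P_0 = np.ones((n, n)) / n
Q_0, Q_F, Q_I = P_0, P_F - P_0, np.eye(n) - P_F

Cov = tau2 * Z_F @ Z_F.T + sigma2 * np.eye(n)                 # Cov(Z_F U + eps)
print(np.allclose(Cov, lam * P_F + sigma2 * Q_I))             # True: Cov Y = lambda P_F + sigma^2 Q_I
print(np.allclose(Q_0 @ Q_F, 0), np.allclose(Q_F @ Q_I, 0))   # projections are mutually orthogonal
print(np.linalg.matrix_rank(Q_F), np.linalg.matrix_rank(Q_I)) # dimensions k - 1 = 2 and n - k = 9
```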

More precisely,
$$|\Sigma|^{-1/2} \exp\!\left(-\tfrac{1}{2}(Y - 1_n\xi)^T \Sigma^{-1} (Y - 1_n\xi)\right) = \lambda^{-k/2} \exp\!\left(-\tfrac{1}{2\lambda}\|P_F Y - 1_n\xi\|^2\right) (\sigma^2)^{-k(m-1)/2} \exp\!\left(-\tfrac{1}{2\sigma^2}\|Q_I Y\|^2\right)$$
where we used $\Sigma^{-1} = \sigma^{-2} Q_I + \lambda^{-1} P_F$ and $|\Sigma| = \lambda^k (\sigma^2)^{mk - k}$ (see exercises 2, 3, 5). The two factors in the above likelihood are in fact generalized densities of the degenerate normal vectors $P_F Y$ and $Q_I Y$ (i.e. these vectors are $n$-dimensional but their distributions are concentrated on lower-dimensional subspaces of dimension $k$ and $n - k$).

Consider e.g. the factor $\lambda^{-k/2} \exp(-\tfrac{1}{2\lambda}\|P_F Y - 1_n\xi\|^2)$ involving the parameters $\lambda$ and $\xi$. We can maximize this with respect to $\lambda$ and $\xi$ in exactly the same way as when we previously considered the likelihood of a linear normal model $N_n(\mu, \sigma^2 I)$. Since $L_0 = \mathrm{span}\{1_n\}$ we thus obtain $1_n \hat\xi = P_0 P_F Y = P_0 Y$ and $\hat\lambda = \|P_F Y - P_0 Y\|^2/k = SSB/k$. Proceeding in the same way for the second factor (where there is no mean parameter), we obtain $\hat\sigma^2 = \|Q_I Y\|^2/(k(m-1)) = SSE/(k(m-1))$.

Note that $Q_F Y \sim N_n(0, \lambda Q_F)$. By exercise 6, $\|P_F Y - P_0 Y\|^2 = \|Q_F Y\|^2 \sim \lambda \chi^2(k-1)$, which has mean $\lambda(k-1)$. Thus $\hat\lambda$ is biased. An unbiased (REML) estimate is given by $\tilde\lambda = \|P_F Y - P_0 Y\|^2/(k-1) = SSB/(k-1)$.

3 Two-way analysis of variance

Consider now a model with two factors $T$ (treatment) and $P$ (plot) with numbers of levels $d_T$ and $d_P$. Moreover let $P \times T$ be the cross-factor (it has $d_P d_T$ levels, one for each combination of levels of $P$ and $T$). More specifically, $I$ consists of indices $ptr$ and the factor mappings are $P(ptr) = p$, $T(ptr) = t$, and $P \times T(ptr) = pt$. Assume $P \times T$ is balanced with $n_{P \times T} = m$ observations for each level. Then $P$ and $T$ are balanced too with numbers of observations $n_P = m d_T$ and $n_T = m d_P$ for each level.
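The setup can be illustrated numerically; in the sketch below the layout $d_P = 4$, $d_T = 3$, $m = 2$ is an arbitrary choice, and the design matrices are built as Kronecker products for indices $ptr$ in lexicographic order. The last lines check the balance relation $P_P P_T = P_T P_P = P_0$ used in the next subsection (exercise 7).

```python
# Illustration only: balanced two-way layout, indices ptr in lexicographic order.
import numpy as np

d_P, d_T, m = 4, 3, 2
n = d_P * d_T * m

Z_P  = np.kron(np.eye(d_P), np.ones((d_T * m, 1)))                        # plot indicator
Z_T  = np.kron(np.ones((d_P, 1)), np.kron(np.eye(d_T), np.ones((m, 1))))  # treatment indicator
Z_PT = np.kron(np.eye(d_P * d_T), np.ones((m, 1)))                        # cell (p, t) indicator

P_P  = Z_P  @ Z_P.T  / (m * d_T)      # n_P  = m d_T observations per plot level
P_T  = Z_T  @ Z_T.T  / (m * d_P)      # n_T  = m d_P observations per treatment level
P_PT = Z_PT @ Z_PT.T / m              # n_PT = m observations per cell
P_0  = np.ones((n, n)) / n

print(np.allclose(P_P @ P_T, P_0), np.allclose(P_T @ P_P, P_0))   # exercise 7
print(np.allclose(P_PT @ P_P, P_P))                               # L_P is contained in L_{PxT}
```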

A two-way ANOVA model with fixed $T$ effects and random $P$ and $P \times T$ effects is
$$Y_{ptr} = \xi + \beta_t + U_p + U_{pt} + \epsilon_{ptr}, \quad p = 1, \dots, d_P, \ t = 1, \dots, d_T, \ r = 1, \dots, m,$$
where the $U_p \sim N(0, \sigma_P^2)$, $U_{pt} \sim N(0, \sigma_{P \times T}^2)$ and $\epsilon_{ptr} \sim N(0, \sigma^2)$ are independent. With a convenient abuse of terminology we refer to $P$ and $P \times T$ as random factors. The random effects $U_p$ and $U_{pt}$ can e.g. serve to model soil variation between plots in a field in the case of an agricultural experiment. In vector form the model is
$$Y = \mu + Z_P U_P + Z_{P \times T} U_{P \times T} + \epsilon$$
where $\mu \in L_T$. The vectors $Y$, $U_P$ and $U_{P \times T}$ are again obtained by stacking, e.g. $U_P = (U_1, \dots, U_{d_P})^T$.

3.1 Orthogonal decomposition

Similarly to the one-way ANOVA we define
$$V_0 = L_0, \quad V_T = L_T \ominus V_0, \quad V_P = L_P \ominus V_0.$$
Since $P \times T$ is balanced it follows that $P_T P_P = P_P P_T = P_0$ (exercise 7), which implies $Q_T Q_P = 0$ where $Q_T = P_T - P_0$ and $Q_P = P_P - P_0$. Hence $V_T$ and $V_P$ are orthogonal and we obtain an orthogonal decomposition of the sum of $L_P$ and $L_T$:
$$L_P + L_T = \{v + w \mid v \in L_P, \ w \in L_T\} = V_0 \oplus V_P \oplus V_T.$$
We further define
$$V_{P \times T} = L_{P \times T} \ominus (L_P + L_T), \quad V_I = \mathbb{R}^n \ominus L_{P \times T}$$
and obtain the orthogonal decomposition
$$\mathbb{R}^n = V_0 \oplus V_P \oplus V_T \oplus V_{P \times T} \oplus V_I.$$
The dimensions of the $V$ spaces are $f_0 = 1$, $f_P = d_P - 1$, $f_T = d_T - 1$, $f_{P \times T} = d_P d_T - d_P - d_T + 1 = (d_P - 1)(d_T - 1)$ and $f_I = n - d_P d_T$. The orthogonal projections on the $V$ spaces are $Q_0 = P_0$, $Q_P = P_P - Q_0$, $Q_T = P_T - Q_0$, $Q_{P \times T} = P_{P \times T} - Q_P - Q_T - Q_0$ and $Q_I = I - P_{P \times T}$.
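Continuing the same illustrative layout as above ($d_P = 4$, $d_T = 3$, $m = 2$, an arbitrary choice), the following sketch forms the $Q$ projections and confirms that their ranks equal the dimensions $f_0, f_P, f_T, f_{P \times T}, f_I$, that they sum to the identity, and that they are mutually orthogonal.

```python
# Illustration only: ranks of the Q projections and completeness of the decomposition.
import numpy as np

d_P, d_T, m = 4, 3, 2
n = d_P * d_T * m
Z_P  = np.kron(np.eye(d_P), np.ones((d_T * m, 1)))
Z_T  = np.kron(np.ones((d_P, 1)), np.kron(np.eye(d_T), np.ones((m, 1))))
Z_PT = np.kron(np.eye(d_P * d_T), np.ones((m, 1)))
P_P, P_T, P_PT = Z_P @ Z_P.T/(m*d_T), Z_T @ Z_T.T/(m*d_P), Z_PT @ Z_PT.T/m
P_0 = np.ones((n, n)) / n

Q_0, Q_P, Q_T = P_0, P_P - P_0, P_T - P_0
Q_PT = P_PT - Q_P - Q_T - Q_0
Q_I  = np.eye(n) - P_PT

Qs = [Q_0, Q_P, Q_T, Q_PT, Q_I]
print([np.linalg.matrix_rank(Q) for Q in Qs])   # [1, 3, 2, 6, 12] = [f_0, f_P, f_T, f_PT, f_I]
print(np.allclose(sum(Qs), np.eye(n)))          # the V spaces decompose R^n
print(np.allclose(Q_PT @ Q_P, 0))               # mutual orthogonality, e.g. of V_PT and V_P
```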

In line with the orthogonal decomposition of $\mathbb{R}^n$ we also obtain two decompositions of $Y$ into orthogonal components:
$$Y = Q_0 Y + Q_P Y + Q_T Y + Q_{P \times T} Y + Q_I Y = \tilde Q_P Y + \tilde Q_{P \times T} Y + \tilde Q_I Y$$
where $\tilde Q_P = Q_0 + Q_P = P_P$, $\tilde Q_{P \times T} = Q_{P \times T} + Q_T$ and $\tilde Q_I = Q_I$. The second decomposition corresponds to a coarser decomposition $\mathbb{R}^n = \tilde V_P \oplus \tilde V_{P \times T} \oplus \tilde V_I$ into orthogonal subspaces $\tilde V_P = V_0 \oplus V_P = L_P$, $\tilde V_{P \times T} = V_T \oplus V_{P \times T}$ and $\tilde V_I = V_I$ associated with each random factor $P$, $P \times T$ and $I$.

The coarser decomposition of $\mathbb{R}^n$ corresponds to a decomposition of the covariance matrix using the $\tilde Q$ projection matrices:
$$\mathrm{Cov}\, Y = \sigma_P^2 n_P P_P + \sigma_{P \times T}^2 n_{P \times T} P_{P \times T} + \sigma^2 I = \lambda_P \tilde Q_P + \lambda_{P \times T} \tilde Q_{P \times T} + \lambda_I \tilde Q_I$$
where
$$\lambda_P = \sigma^2 + n_{P \times T} \sigma_{P \times T}^2 + n_P \sigma_P^2 \qquad (1)$$
$$\lambda_{P \times T} = \sigma^2 + n_{P \times T} \sigma_{P \times T}^2 \qquad (2)$$
$$\lambda_I = \sigma^2.$$
The mean vector $\mu \in L_T = V_0 \oplus V_T$ is decomposed as
$$\mu = \tilde Q_P \mu + \tilde Q_{P \times T} \mu + \tilde Q_I \mu = Q_0 \mu + Q_T \mu + 0 = \mu_0 + \mu_T$$
where $\mu_0 \in L_0$ and $\mu_T \in V_T$. Note the one-to-one correspondences between the $\sigma^2$'s and the $\lambda$'s and between $\mu$ and $(\mu_0, \mu_T)$.

3.2 Relation to sum-to-zero constraint

In the model for the mean vector $\mu$, the parameters $\xi$ and $\beta_1, \dots, \beta_{d_T}$ are not identifiable. This is because, for any constant $k$,
$$\mu_{ptr} = \xi + \beta_t = (\xi + k) + (\beta_t - k).$$

Hence if we add $k$ to $\xi$ we obtain the same mean vector by just subtracting $k$ from the $\beta_t$'s. One way to enforce identifiability is to require that one $\beta_t$ is zero (i.e. choosing a reference group). Another option is to introduce the sum-to-zero constraint $\sum_{t=1}^{d_T} \beta_t = 0$.

Note that vectors $v \in V_T$ are of the form $v_{ptr} = \beta_t$ since they lie in $L_T$, and they further satisfy the sum-to-zero constraint
$$1_n^T v = d_P m \sum_{t=1}^{d_T} \beta_t = 0$$
due to the orthogonality of $V_0$ and $V_T$. In other words, enforcing the sum-to-zero constraint on the $\beta_t$ parameters corresponds to the decomposition of $\mu$ into a constant vector $\xi 1_n$ falling in $V_0$ and a vector $v = (\beta_t)_{ptr} \in V_T$.

3.3 Estimation using factorization of likelihood

As for the one-way ANOVA, $Y$ is decomposed into independent normal vectors
$$\tilde Q_P Y \sim N(\mu_0, \lambda_P \tilde Q_P), \quad \tilde Q_{P \times T} Y \sim N(\mu_T, \lambda_{P \times T} \tilde Q_{P \times T}), \quad \tilde Q_I Y \sim N(0, \lambda_I \tilde Q_I).$$
Accordingly, the density for $Y$ factorizes into a product of (generalized) densities for $\tilde Q_P Y$, $\tilde Q_{P \times T} Y$ and $\tilde Q_I Y$, similar to the case of the one-way ANOVA. For instance, the density for $\tilde Q_P Y$ is proportional to
$$\lambda_P^{-d_P/2} \exp\!\left(-\tfrac{1}{2\lambda_P}\|\tilde Q_P Y - \mu_0\|^2\right).$$
Thus, in analogy with estimation in the usual linear model $Y \sim N_n(\mu, \sigma^2 I)$, we obtain
$$\hat\mu_0 = P_0 \tilde Q_P Y = P_0 Y = 1_n \bar Y \quad\text{and}\quad \hat\lambda_P = \|\tilde Q_P Y - P_0 Y\|^2/d_P = \|Q_P Y\|^2/d_P.$$
Similarly,
$$\hat\mu_T = Q_T \tilde Q_{P \times T} Y = Q_T Y, \qquad \hat\mu = \hat\mu_0 + \hat\mu_T = P_T Y,$$
$$\hat\lambda_{P \times T} = \|\tilde Q_{P \times T} Y - Q_T Y\|^2/(d_P d_T - d_P) = \|Q_{P \times T} Y\|^2/(d_P d_T - d_P),$$
and $\hat\lambda_I = \hat\sigma^2 = \|Q_I Y\|^2/(n - d_P d_T)$.
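A small simulation sketch (all parameter values and the layout below are made up) shows how these estimates are just squared norms of projected data divided by the corresponding denominators; the unbiased variants discussed next simply replace $d_P$ and $d_P d_T - d_P$ by $f_P$ and $f_{P \times T}$.

```python
# Simulation sketch only: made-up values, d_P = 8 plots, d_T = 3 treatments, m = 4.
import numpy as np

rng = np.random.default_rng(1)
d_P, d_T, m = 8, 3, 4
n = d_P * d_T * m
sigma2_P, sigma2_PT, sigma2 = 1.0, 0.5, 0.25
beta = np.array([0.0, 1.0, 2.0])            # treatment effects; xi = 0 for simplicity

Z_P  = np.kron(np.eye(d_P), np.ones((d_T * m, 1)))
Z_T  = np.kron(np.ones((d_P, 1)), np.kron(np.eye(d_T), np.ones((m, 1))))
Z_PT = np.kron(np.eye(d_P * d_T), np.ones((m, 1)))
P_P, P_T, P_PT = Z_P @ Z_P.T/(m*d_T), Z_T @ Z_T.T/(m*d_P), Z_PT @ Z_PT.T/m
P_0  = np.ones((n, n)) / n
Q_P, Q_PT, Q_I = P_P - P_0, P_PT - P_P - P_T + P_0, np.eye(n) - P_PT

Y = (Z_T @ beta + Z_P @ rng.normal(0, np.sqrt(sigma2_P), d_P)
     + Z_PT @ rng.normal(0, np.sqrt(sigma2_PT), d_P * d_T)
     + rng.normal(0, np.sqrt(sigma2), n))

mu_hat     = P_T @ Y                          # MLE of the mean vector
lam_P_hat  = Y @ Q_P  @ Y / d_P               # squared norm of Q_P Y over d_P
lam_PT_hat = Y @ Q_PT @ Y / (d_P * d_T - d_P)
sigma2_hat = Y @ Q_I  @ Y / (n - d_P * d_T)
print(lam_P_hat, lam_PT_hat, sigma2_hat)
# true values: lambda_P = 14.25, lambda_PT = 2.25, sigma^2 = 0.25 for the values above
```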

Since $V_P$ and $V_{P \times T}$ have dimensions $f_P = d_P - 1$ and $f_{P \times T} = (d_P - 1)(d_T - 1)$, we often use these as denominators (cf. exercise 6) instead of $d_P$ and $d_P d_T - d_P$ for $\lambda_P$ and $\lambda_{P \times T}$, thereby obtaining the unbiased estimates
$$\tilde\lambda_P = \|Q_P Y\|^2/f_P, \qquad \tilde\lambda_{P \times T} = \|Q_{P \times T} Y\|^2/f_{P \times T}$$
of $\lambda_P$ and $\lambda_{P \times T}$.

Note that the MLE for $\mu$ is in fact identical to the MLE in the linear model with $\mu \in L_T$ and no random effects.

Estimation of original variance components

Estimates of the original variances $\sigma_P^2$ and $\sigma_{P \times T}^2$ can be obtained by simply solving (1) and (2) with respect to these parameters. That is,
$$\tilde\sigma_P^2 = (\tilde\lambda_P - \tilde\lambda_{P \times T})/n_P, \qquad \tilde\sigma_{P \times T}^2 = (\tilde\lambda_{P \times T} - \hat\sigma^2)/n_{P \times T}.$$
Note that there is no guarantee that these estimates are non-negative. In fact there could be several reasons for obtaining negative variance estimates:

- if e.g. the true $\sigma_P^2$ is zero or close to zero, it is not unlikely to encounter $\tilde\lambda_P < \tilde\lambda_{P \times T}$ due to sampling variation.
- observations within a group could be negatively correlated, which could be reflected by a negative variance estimate (remember the definition of the intra-class correlation).
- the data deviates in some way (e.g. outliers) from the assumed model.

In general, if a negative variance estimate is obtained, it is recommended (as always) to check carefully whether the model assumptions seem valid. If the data does not seem to contradict the model, one may consider refitting the model without the random factor for which a negative variance estimate was obtained. If negative correlation within groups of the data is plausible, one may look into an alternative model to the assumed linear mixed model.

4 Strata

A crucial common point for the one- and two-way ANOVA models in the previous sections is that the factorization of the likelihood is based on an orthogonal decomposition induced by the random factors only. This is due

to the decomposition of the covariance matrix into terms given by scaled projection matrices for the random factors. The fixed effects are decomposed into components according to $V$ spaces for the fixed factors. These components are then assigned to strata corresponding to $\tilde V$ spaces for the random factors.

We say that a factor $F$ is finer than $G$ (or $G$ coarser than $F$) if the levels of $G$ can be obtained by merging levels of $F$. We then write $G \preceq F$. The rule for allocating fixed effects to strata is then that $F$ belongs to the $B$ stratum (where $B$ is a random factor) if $B$ is the coarsest random factor which is finer than $F$. This is of course assuming that this rule makes sense, i.e. that we work with an experimental design where the rule gives a unique allocation of each fixed factor to one random factor. It is for instance required that $I$ is a random factor. Also, for fixed $F$ and random $B_1$, $B_2$, we can not have both $F \preceq B_1$ and $F \preceq B_2$ unless either $B_1 \preceq B_2$ or $B_2 \preceq B_1$.

[Figure 1: Structure diagram for the two-way ANOVA with random effects, showing the factors $I$, $P \times T$, $P$, $T$ and $O$.]

To get an overview of the allocation to strata it is useful to draw a structure diagram as shown for the two-way ANOVA in Figure 1. In the structure diagram there is an arrow from $F$ to $G$ if $G$ is coarser than $F$ and there is no intermediate factor which is coarser than $F$ and finer than $G$. In Figure 1

there are three strata (black, green and red) corresponding to the random factors $I$, $P \times T$ and $P$. Note that here we can not have both $P$ and $T$ random unless $O$ is random too (if $O$ was fixed in this case there would not be a unique allocation of $O$ to a random factor stratum). We also want to respect the hierarchical principle for the mean structure, so $P \times T$ can not be fixed if $P$ is random. A more general discussion of allocation to strata is given in Section 7.

5 Hypothesis tests and confidence intervals

We here exemplify how hypothesis tests are conducted in balanced ANOVAs by considering one- and two-way ANOVAs.

5.1 One-way ANOVA

For the one-way ANOVA with random effects consider the $F$-statistic
$$F = \frac{SSB/(k-1)}{SSE/(k(m-1))} = \frac{\tilde\lambda}{\hat\sigma^2} \sim \frac{\sigma^2 + m\tau^2}{\sigma^2} F(k-1, k(m-1)) = (1 + m\gamma) F(k-1, k(m-1))$$
where $\gamma = \tau^2/\sigma^2$ is the signal-to-noise ratio. Thus with $q_L$ and $q_U$ e.g. the 2.5% and 97.5% quantiles of $F(k-1, k(m-1))$, we have
$$P(q_L \le F/(1 + m\gamma) \le q_U) = 95\% \iff P\big((F/q_U - 1)/m \le \gamma \le (F/q_L - 1)/m\big) = 95\%$$
so that $[(F/q_U - 1)/m, (F/q_L - 1)/m]$ is a 95% confidence interval for $\gamma$; see also Remark 5.10 in [1].

One hypothesis is of particular interest: $H_0: \tau^2 = 0$, i.e. there is no variation between groups. A simple $F$-test for this is based on the observation
$$\tau^2 = 0 \iff \gamma = 0 \iff \sigma^2 = \lambda.$$

Thus large values of the $F$-statistic above are critical for $H_0$, and under $H_0$, $F$ has an $F(k-1, k(m-1))$ distribution. Note that the $F$-statistic coincides with the $F$-test for no group effects in a one-way ANOVA with fixed group effects (which is equivalent to the likelihood ratio test for no group effects in the fixed effects one-way ANOVA).

5.2 Tests for fixed effects in two-way ANOVA

For the two-way ANOVA considered in Section 3 the interesting hypothesis regarding fixed effects is
$$H_0: \beta_1 = \beta_2 = \dots = \beta_{d_T},$$
i.e. no fixed group effects. This is equivalent to $\mu_T = 0$ (referring to the notation of Section 3.1). Thus for constructing the likelihood ratio test we only need to consider the factor corresponding to the term $\tilde Q_{P \times T} Y \sim N(\mu_T, \lambda_{P \times T} \tilde Q_{P \times T})$, which has density
$$\lambda_{P \times T}^{-(d_P d_T - d_P)/2} \exp\!\left(-\tfrac{1}{2\lambda_{P \times T}}\|\tilde Q_{P \times T} Y - \mu_T\|^2\right).$$
Thus the situation is equivalent to testing $\mu = 0$ for a linear normal model $N_{d_P d_T - d_P}(\mu, \lambda_{P \times T} I)$. We can thus proceed as for a usual linear normal model (without random effects) and obtain the $F$-test
$$F = \frac{\|Q_T Y\|^2/(d_T - 1)}{\tilde\lambda_{P \times T}} \sim F(d_T - 1, (d_P - 1)(d_T - 1)).$$

5.3 Test for variance components in two-way ANOVA

Recall $\lambda_I = \sigma^2$, $\lambda_{P \times T} = \sigma^2 + n_{P \times T} \sigma_{P \times T}^2$ and $\lambda_P = \sigma^2 + n_{P \times T} \sigma_{P \times T}^2 + n_P \sigma_P^2$. Hence e.g.
$$\sigma_{P \times T}^2 = 0 \iff \lambda_I = \lambda_{P \times T}.$$
Thus a natural statistic (but not a likelihood-ratio statistic) for testing $\sigma_{P \times T}^2 = 0$ is
$$F = \frac{\tilde\lambda_{P \times T}}{\hat\sigma^2},$$
which has an $F((d_P - 1)(d_T - 1), n - d_P d_T)$ distribution if $\sigma_{P \times T}^2 = 0$. Large values of $F$ are critical for $\sigma_{P \times T}^2 = 0$. Note $\tilde\lambda_{P \times T} = \|Q_{P \times T} Y\|^2/((d_P - 1)(d_T - 1))$, so $F$ is identical to the $F$-statistic for testing the hypothesis of no (fixed) effect of the factor $P \times T$ in a linear normal model without random effects.
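The two tests can be computed directly from the projections. The following self-contained sketch (made-up layout and parameter values; `scipy.stats` is used only for the $F$ distribution) simulates data with both treatment effects and interaction variance present and reports the two $F$ statistics with their p-values.

```python
# Simulation sketch only: F tests for no treatment effect and for sigma^2_PT = 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
d_P, d_T, m = 10, 4, 3
n = d_P * d_T * m
Z_P  = np.kron(np.eye(d_P), np.ones((d_T * m, 1)))
Z_T  = np.kron(np.ones((d_P, 1)), np.kron(np.eye(d_T), np.ones((m, 1))))
Z_PT = np.kron(np.eye(d_P * d_T), np.ones((m, 1)))
P_P, P_T, P_PT = Z_P @ Z_P.T/(m*d_T), Z_T @ Z_T.T/(m*d_P), Z_PT @ Z_PT.T/m
P_0 = np.ones((n, n)) / n
Q_T, Q_PT, Q_I = P_T - P_0, P_PT - P_P - P_T + P_0, np.eye(n) - P_PT
f_T, f_PT, f_I = d_T - 1, (d_P - 1) * (d_T - 1), n - d_P * d_T

beta = np.array([0.0, 0.5, 1.0, 1.5])        # treatment effects present
Y = (Z_T @ beta + Z_P @ rng.normal(0, 1.0, d_P)
     + Z_PT @ rng.normal(0, 0.7, d_P * d_T) + rng.normal(0, 0.5, n))

lam_PT = Y @ Q_PT @ Y / f_PT                 # unbiased estimate of lambda_PT
sigma2 = Y @ Q_I @ Y / f_I                   # estimate of sigma^2

F_treat = (Y @ Q_T @ Y / f_T) / lam_PT       # Section 5.2: beta_1 = ... = beta_dT
F_inter = lam_PT / sigma2                    # Section 5.3: sigma^2_PT = 0
print(F_treat, stats.f.sf(F_treat, f_T, f_PT))
print(F_inter, stats.f.sf(F_inter, f_PT, f_I))
```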

5.4 Confidence intervals for parameters in two-way ANOVA

Regarding the mean vector $\mu = 1_n \xi + Z_T \beta$ where $\beta = (\beta_1, \dots, \beta_{d_T})$, we can consider confidence intervals for components $\xi + \beta_i$ of $\mu$ or for contrasts $\delta_{ij} = \beta_i - \beta_j$ corresponding to differences between group means. Here it is easy to derive confidence intervals for the contrasts: employing the orthogonal decomposition $\mu = \mu_0 + \mu_T$, the contrasts only involve $\mu_T$ in the $P \times T$ stratum. E.g. if $c$ is a vector with $c_{1i1} = 1$, $c_{1j1} = -1$ and zeroes elsewhere, then $\delta_{ij} = \beta_i - \beta_j = c^T\mu = c^T\mu_T$. Thus $\delta_{ij}$ can be estimated as
$$\hat\delta_{ij} = c^T\hat\mu_T = c^T Q_T \tilde Q_{P \times T} Y \sim N(\beta_i - \beta_j, \lambda_{P \times T}\, c^T Q_T c)$$
and we can thus follow the usual route to constructing confidence intervals based on the $t$-distribution, combining the above normal distribution for $\hat\delta_{ij}$ and the scaled $\chi^2$ distribution of $\tilde\lambda_{P \times T}$ with $(d_P - 1)(d_T - 1)$ degrees of freedom. Note
$$c^T Q_T \tilde Q_{P \times T} Y = c^T Q_T Y = \bar y_{\cdot i \cdot} - \bar y_{\cdot j \cdot}$$
and
$$c^T Q_T c = c^T P_T c - c^T P_0 c = c^T P_T c = (1, -1)(1/n_T, -1/n_T)^T = \frac{2}{n_T}.$$
Hence the $t$-statistic becomes
$$\frac{\bar y_{\cdot i \cdot} - \bar y_{\cdot j \cdot}}{\sqrt{2\tilde\lambda_{P \times T}/n_T}},$$
which has a $t((d_P - 1)(d_T - 1))$ distribution under the null hypothesis $\beta_i = \beta_j$.

You can also get the above results by simply observing that
$$\bar y_{\cdot i \cdot} - \bar y_{\cdot j \cdot} = \beta_i - \beta_j + \frac{m}{n_T}\sum_p U_{pi} - \frac{m}{n_T}\sum_p U_{pj} + \frac{1}{n_T}\sum_{p,r}\epsilon_{pir} - \frac{1}{n_T}\sum_{p,r}\epsilon_{pjr} \sim N\!\left(\beta_i - \beta_j, \frac{2}{n_T}(m\sigma_{P \times T}^2 + \sigma^2)\right).$$
Note here $m = n_{P \times T}$, so $m\sigma_{P \times T}^2 + \sigma^2 = \lambda_{P \times T}$.

Confidence intervals for the $\lambda$ variance parameters (including $\sigma^2 = \lambda_I$) are straightforward due to the exact scaled $\chi^2$ distributions of their estimates. Confidence intervals for the original variance parameters (other than $\sigma^2$) are less straightforward since estimates of these are distributed as differences of scaled $\chi^2$ variables.
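A sketch of the contrast confidence interval (the same kind of made-up balanced layout and simulated data as above; the 0-based indices $i$, $j$ pick out two treatments) is given below.

```python
# Simulation sketch only: t-based confidence interval for a treatment contrast.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
d_P, d_T, m = 10, 4, 3
n, n_T = d_P * d_T * m, m * d_P
Z_P  = np.kron(np.eye(d_P), np.ones((d_T * m, 1)))
Z_T  = np.kron(np.ones((d_P, 1)), np.kron(np.eye(d_T), np.ones((m, 1))))
Z_PT = np.kron(np.eye(d_P * d_T), np.ones((m, 1)))
P_P, P_T, P_PT = Z_P @ Z_P.T/(m*d_T), Z_T @ Z_T.T/(m*d_P), Z_PT @ Z_PT.T/m
P_0 = np.ones((n, n)) / n
Q_PT = P_PT - P_P - P_T + P_0
f_PT = (d_P - 1) * (d_T - 1)

beta = np.array([0.0, 0.5, 1.0, 1.5])
Y = (Z_T @ beta + Z_P @ rng.normal(0, 1.0, d_P)
     + Z_PT @ rng.normal(0, 0.7, d_P * d_T) + rng.normal(0, 0.5, n))

i, j = 1, 0                                   # contrast beta_2 - beta_1 (0-based columns)
ybar = Z_T.T @ Y / n_T                        # treatment means
delta_hat = ybar[i] - ybar[j]
lam_PT = Y @ Q_PT @ Y / f_PT
se = np.sqrt(2 * lam_PT / n_T)
t_quant = stats.t.ppf(0.975, f_PT)
print(delta_hat, (delta_hat - t_quant * se, delta_hat + t_quant * se))
```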

6 Orthogonal decomposition for a K-way ANOVA

In this section we first construct an orthogonal decomposition for a balanced three-way ANOVA and then generalize the approach to a $K$-way balanced ANOVA for arbitrary $K \ge 1$.

We know by now that for a two-way ANOVA corresponding to factors $F_1$ and $F_2$ with $F_1 \times F_2$ balanced, there exists a decomposition of $\mathbb{R}^n$ into orthogonal subspaces $V_I$, $V_{F_1 \times F_2}$, $V_{F_1}$, $V_{F_2}$ and $V_0$. Suppose we introduce a third factor $F_3$ so that $F_1 \times F_2 \times F_3$ is balanced too. Then $\{V_I, V_{F_1 \times F_3}, V_{F_1}, V_{F_3}, V_0\}$ and $\{V_I, V_{F_2 \times F_3}, V_{F_2}, V_{F_3}, V_0\}$ are also sets of orthogonal subspaces. By analogy with a two-way ANOVA with factors $F = F_1 \times F_2$ and $G = F_3$ it also follows that $V_{F_1 \times F_2}$ and $V_{F_3}$ are orthogonal, and similarly for the pairs $V_{F_1 \times F_3}, V_{F_2}$ and $V_{F_2 \times F_3}, V_{F_1}$.

It is also easy to see that
$$P_{F_1 \times F_2} P_{F_2 \times F_3} = P_{F_2}. \qquad (3)$$
This implies that $V_{F_1 \times F_2}$ and $V_{F_2 \times F_3}$ are orthogonal. To see this, note that the corresponding projections are $Q_{F_1 \times F_2} = P_{F_1 \times F_2} - P_{F_2} - Q_{F_1}$ and $Q_{F_2 \times F_3} = P_{F_2 \times F_3} - P_{F_2} - Q_{F_3}$. We now check that $Q_{F_1 \times F_2} Q_{F_2 \times F_3} = 0$, from which the required orthogonality follows:
$$Q_{F_1 \times F_2} Q_{F_2 \times F_3} = (P_{F_1 \times F_2} - P_{F_2} - Q_{F_1})(P_{F_2 \times F_3} - P_{F_2} - Q_{F_3}) = P_{F_2} - P_{F_2} - P_{F_1 \times F_2} Q_{F_3} - P_{F_2} + P_{F_2} + 0 - Q_{F_1} P_{F_2 \times F_3} + 0 + 0 = 0.$$
Here, $P_{F_1 \times F_2} Q_{F_3} = 0$ in analogy with a two-way ANOVA with factors $F = F_1 \times F_2$ and $G = F_3$, and $Q_{F_1} P_{F_2 \times F_3} = 0$ by the same argument. Thus $V_{F_1 \times F_2}$, $V_{F_1 \times F_3}$ and $V_{F_2 \times F_3}$ are orthogonal too. Defining
$$V_{F_1 \times F_2 \times F_3} = L_{F_1 \times F_2 \times F_3} \ominus (V_0 \oplus V_{F_1} \oplus V_{F_2} \oplus V_{F_3} \oplus V_{F_1 \times F_2} \oplus V_{F_1 \times F_3} \oplus V_{F_2 \times F_3}), \qquad V_I = \mathbb{R}^n \ominus L_{F_1 \times F_2 \times F_3},$$
we arrive at the orthogonal decomposition
$$\mathbb{R}^n = V_I \oplus \bigoplus_{A \subseteq \{1,2,3\}} V_{\times_{l \in A} F_l},$$
letting $V_{\times_{l \in \emptyset} F_l} = V_0$. The approach for the three-way ANOVA can be generalized to obtain the following main result:

Theorem 1. Suppose $F_1 \times \dots \times F_K$ is balanced, $K \ge 1$. Define, recursively, for $A \subseteq \{1, \dots, K\}$,
$$V_{\times_{l \in A} F_l} = L_{\times_{l \in A} F_l} \ominus \bigoplus_{B \subsetneq A} V_{\times_{l \in B} F_l}$$
with $V_{\times_{l \in \emptyset} F_l} = V_0 = L_0$, and let $V_I = \mathbb{R}^n \ominus L_{F_1 \times \dots \times F_K}$. Then we have the orthogonal decomposition
$$\mathbb{R}^n = V_I \oplus \bigoplus_{A \subseteq \{1, \dots, K\}} V_{\times_{l \in A} F_l}.$$

Proof. We need to show that (*) $V_{\times_{l \in A} F_l}$ and $V_{\times_{l \in B} F_l}$ are orthogonal whenever $A, B \subseteq \{1, \dots, K\}$, $A \ne B$ (this is already shown for $K = 1, 2, 3$). If $A \cap B = \emptyset$ then (*) follows by analogy to a two-way ANOVA with factors $F = \times_{l \in A} F_l$ and $G = \times_{l \in B} F_l$. If $B \subset A$ we define $F = \times_{l \in B} F_l$ and $G = \times_{l \in A \setminus B} F_l$. Then $V_{\times_{l \in A} F_l} = V_{F \times G}$ and the result again follows by analogy to a two-way ANOVA. The case $A \subset B$ is handled similarly. Finally, if $C = A \cap B$ is non-empty and neither $A \subseteq B$ nor $B \subseteq A$, let $F = \times_{l \in A \setminus C} F_l$, $G = \times_{l \in B \setminus C} F_l$, and $H = \times_{l \in C} F_l$. Then (*) holds by analogy to a three-way ANOVA.

7 Decomposition into strata

In this section we discuss a general decomposition into strata corresponding to factors with random effects. Let $D$ be a set of factors and let $\mathcal{B} \subseteq D$ denote the set of factors with random effects. We assume that

- the factors in $D$ are distinct in the sense that we can not have both $F \preceq F'$ and $F' \preceq F$ for different $F, F' \in D$ (otherwise two factors in $D$ would induce the same grouping of the observations);
- there is an orthogonal decomposition $\mathbb{R}^n = \bigoplus_{F \in D} V_F$ of $\mathbb{R}^n$ into a sum of subspaces $V_F$, $F \in D$, with associated orthogonal projection matrices $Q_F$;
- the covariance matrix of the data vector is decomposed as
$$\Sigma = \sum_{B \in \mathcal{B}} \sigma_B^2 n_B P_B$$

where, for $B \in \mathcal{B}$,
$$P_B = \sum_{F \in D:\, F \preceq B} Q_F. \qquad (4)$$
We want matrices $\tilde Q_B$, $B \in \mathcal{B}$, so that
$$P_B = \sum_{B' \in \mathcal{B}:\, B' \preceq B} \tilde Q_{B'} \qquad (5)$$
where for each $B \in \mathcal{B}$, $\tilde Q_B$ is an orthogonal projection on a subspace $\tilde V_B$, where $\mathbb{R}^n = \bigoplus_{B \in \mathcal{B}} \tilde V_B$ and $\tilde V_B$ is a sum of $V_F$ subspaces. We want this because then we obtain the decomposition
$$\Sigma = \sum_{B \in \mathcal{B}} \sigma_B^2 n_B P_B = \sum_{B \in \mathcal{B}} \lambda_B \tilde Q_B \quad\text{where}\quad \lambda_B = \sum_{B' \in \mathcal{B}:\, B \preceq B'} n_{B'} \sigma_{B'}^2,$$
and hence a decomposition of $Y$ into independent terms $\tilde Q_B Y$, $B \in \mathcal{B}$. The following theorem tells us when we may get what we want.

Theorem 2. A necessary and sufficient condition for (5) is that for all $F \in D$ there exists a $B \in \mathcal{B}$ such that $F \preceq B$ and $B \preceq B'$ for all other $B' \in \mathcal{B}$ with $F \preceq B'$. If this condition holds we can define
$$\tilde V_B = \bigoplus_{F \in D:\, B(F) = B} V_F \quad\text{and}\quad \tilde Q_B = \sum_{F \in D:\, B(F) = B} Q_F.$$

Remark 1. Note that this condition implies that $I$ has to be in $\mathcal{B}$. This is not a serious restriction since we will always include measurement error in our models. It also implies that the $B$ mentioned in the condition is unique. In the statement above and in the proof below we use the notation $B(F)$ for this unique $B$ corresponding to $F$, $F \in D$.

Proof. We first show the sufficiency. Each of the factors $F \preceq B$ on the right hand side of (4) has $B(F) = B'$ for precisely one $B' \in \mathcal{B}$, and this $B'$ satisfies $B' \preceq B$. Hence
$$P_B = \sum_{B' \in \mathcal{B}:\, B' \preceq B} \ \sum_{F \in D:\, B(F) = B'} Q_F.$$
Thus we can define, for $B \in \mathcal{B}$, a subspace and associated orthogonal projection
$$\tilde V_B = \bigoplus_{F \in D:\, B(F) = B} V_F \quad\text{and}\quad \tilde Q_B = \sum_{F \in D:\, B(F) = B} Q_F.$$

Each $V_F$, $F \in D$, is contained in precisely one $\tilde V_B$, so $\mathbb{R}^n = \bigoplus_{B \in \mathcal{B}} \tilde V_B$.

To show that the condition is necessary, note first that to get (5), for each $F \in D$ we must have $V_F \subseteq \tilde V_B$ for some $B \in \mathcal{B}$. Moreover, by equating (4) and (5) we need $F \preceq B$. Suppose next that $B_i$, $i = 1, \dots, m$, $m \ge 1$, are the elements of $\mathcal{B}$ that satisfy $F \preceq B_i$, $i = 1, \dots, m$. If $m \ge 2$, assume by contradiction that for each $B_i$ there is a $B_j$ so that $B_i \not\preceq B_j$. Assume without loss of generality that $V_F \subseteq \tilde V_{B_1}$ and that $B_1 \not\preceq B_2$. Since $F \preceq B_2$ we would then by (4) need to include $\tilde Q_{B_1}$ in the sum (5) for $P_{B_2}$, but this would be a contradiction with the restriction on the sum in (5) when $B_1 \not\preceq B_2$.

8 Concluding remarks

This note essentially gives a simplified exposition of results presented in [3]. A general exposition can also be found in [2]. We have here considered the use of orthogonal decompositions only in the context of balanced ANOVAs. However, the same principles apply for other types of linear mixed models allowing for relevant orthogonal decompositions of the sample space. One example is a random coefficient model for the well-known orthodontic data set consisting of data for 27 children, each with 4 measurements. Here one can e.g. observe that the MLEs of the mean parameters are the same in the models with and without random child effects.

References

[1] Henrik Madsen and Poul Thyregod. Introduction to General and Generalized Linear Models. Chapman & Hall/CRC Texts in Statistical Science.

[2] Jesper Møller. Centrale statistiske modeller og likelihood baserede metoder. Department of Mathematical Sciences, Aarhus University.

[3] Tue Tjur. Analysis of variance models in orthogonal designs. International Statistical Review, 52.

A Orthogonal projections

Suppose $L$ is a subspace of $\mathbb{R}^n$, $n \ge 1$. Let $L^\perp = \{v \in \mathbb{R}^n \mid v \cdot w = 0 \text{ for all } w \in L\}$ denote its orthogonal complement.

Orthogonal decomposition: each $x \in \mathbb{R}^n$ has a unique decomposition
$$x = u + v$$
where $u \in L$ and $v \in L^\perp$.

Orthogonal projection: $u$ and $v$ above are the orthogonal projections $p_L(x)$ and $p_{L^\perp}(x)$ of $x$ on respectively $L$ and $L^\perp$. Due to the orthogonality of $u$ and $v$ we obtain Pythagoras' theorem: $\|x\|^2 = \|u\|^2 + \|v\|^2$.

A few facts regarding orthogonal projections:

(a) the orthogonal projection $p_L : \mathbb{R}^n \to L$ on $L$ is a linear mapping. It is thus given by a unique matrix transformation $p_L(x) = Px$ where $P$ is an $n \times n$ matrix.

(b) the projection matrix $P$ is symmetric ($P^T = P$) and idempotent ($P^2 = P$).

(c) conversely, if a matrix $Q$ is symmetric and idempotent and $L = \mathrm{col}\, Q$, then $Q$ is the matrix of the orthogonal projection on $L$.

(d) if $L = \mathrm{col}\, X$ and $X$ has full rank, then $P = X(X^T X)^{-1} X^T$.

(e) if $P$ is the matrix of the orthogonal projection on $L$, then $I - P$ is the matrix of the orthogonal projection on $L^\perp$.

(f) if $L_1$ and $L_2$ are subspaces with $L_1 \subseteq L_2$ and associated orthogonal projections $P_1$ and $P_2$, then the orthogonal projection on $L_2 \ominus L_1$ is $P_2 - P_1$.

Example: the orthogonal projection of $x$ on the subspace spanned by a single vector $v$ is $\frac{v^T x}{\|v\|^2} v$.

Example: the orthogonal projection of $x$ on a subspace spanned by orthogonal vectors $v_1, \dots, v_p$ is $\sum_{i=1}^{p} \frac{v_i^T x}{\|v_i\|^2} v_i$.
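These facts are easy to verify numerically; the sketch below uses an arbitrary random full-rank matrix $X$ to check (b), (d), (e), (f) and the single-vector example.

```python
# Illustration only: verify basic projection facts for a random full-rank X.
import numpy as np

rng = np.random.default_rng(0)
n, p = 7, 3
X = rng.normal(size=(n, p))                      # full column rank with probability 1
P = X @ np.linalg.solve(X.T @ X, X.T)            # fact (d): projection onto col X

print(np.allclose(P, P.T), np.allclose(P @ P, P))   # fact (b): symmetric and idempotent
print(np.allclose((np.eye(n) - P) @ X, 0))          # fact (e): I - P annihilates col X

v = X[:, 0]                                      # subspace L_1 = span{v} inside L_2 = col X
P1 = np.outer(v, v) / (v @ v)                    # single-vector example: projection onto span{v}
D = P - P1                                       # fact (f): projection onto L_2 minus L_1
print(np.allclose(D @ D, D), np.allclose(D @ v, 0))
```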

B Exercises

1. Let $L_1 \subseteq L_2$ with orthogonal projections $P_1$ and $P_2$. Show that $P_2 - P_1$ is the orthogonal projection on $L_2 \ominus L_1$.

2. Show that an orthogonal projection only has eigenvalues 1 or 0.

3. For a symmetric matrix $A$, show that the determinant $|A|$ is the product of $A$'s eigenvalues.

4. Show that $Q_I P_F = 0$.

5. Let $S = aP + bQ$ where $P$ and $Q$ are orthogonal projections with $P + Q = I$ and $a, b > 0$. Show that the eigenvalues of $S$ are the nonzero eigenvalues $a$ and $b$ of $aP$ and $bQ$. Show that $S^{-1} = a^{-1}P + b^{-1}Q$.

6. Show that $\|Y\|^2 \sim \sigma^2\chi^2(d)$ if $Y \sim N(0, \sigma^2 P)$ and $P$ is an orthogonal projection on a subspace of dimension $d$ (hint: use the spectral decomposition and the result above regarding the eigenvalues of $P$).

7. Check that $P_P P_T = P_0$ when $P \times T$ is balanced.

8. Check that $L_T + L_P = V_0 \oplus V_P \oplus V_T$.
