Analysis of variance using orthogonal projections
Rasmus Waagepetersen

Abstract

The purpose of this note is to show how statistical theory for inference in balanced ANOVA models can be conveniently developed using orthogonal projections. The approach is exemplified for one- and two-way ANOVA, and finally some hints are given regarding the extension to three- or higher-way ANOVA.

1 Prerequisites on factors and projections

Analysis of variance (ANOVA) models are specified in terms of grouping variables, or factors. Suppose observations y_i are indexed by a set I of cardinality n. A factor F is a function that assigns a grouping label, from a finite set of labels, to each observation: F(i) = q means that observation y_i (or index i) is assigned to group/level q of the factor F. Suppose F generates k groups. The design matrix Z_F corresponding to F is then n × k, and its (i,q)th entry is 1 if i is assigned to group q and 0 otherwise. Note that in many applications i is a multi-index of the form i = i_1 i_2 ··· i_p for p ≥ 1.

For any factor F we denote by L_F the column space of Z_F. The orthogonal projection onto L_F is denoted P_F. Applying P_F to a vector (y_i)_{i∈I} replaces each y_i by the average of the observations in the group that i belongs to (i.e. the average of those y_l for which F(l) = F(i)).

Two factors play a special role. The unit factor I has a unique level for each observation, so L_I = R^n and P_I = I (with an abuse of notation, I is used both for the factor and for the identity matrix). The factor 0 assigns all observations to the same group, so L_0 = span{1_n} and P_0 = 1_n 1_n^T / n. Note that P_0 y is simply the vector in which each component equals the average ȳ = Σ_{i∈I} y_i / n. For any factor F, L_0 ⊆ L_F ⊆ L_I.
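The group-averaging property of P_F is easy to check numerically. The following sketch builds the design matrix for a small hypothetical factor (labels and data invented for illustration) and computes P_F via the general least-squares projection formula P = Z(Z^T Z)^{-1} Z^T (fact (d) in the appendix):

```python
import numpy as np

# A hypothetical factor with k = 3 levels on n = 6 observations.
labels = np.array([0, 0, 1, 1, 2, 2])
n, k = 6, 3

# Design matrix Z_F: entry (i, q) is 1 if observation i has level q.
Z = np.zeros((n, k))
Z[np.arange(n), labels] = 1.0

# Orthogonal projection onto L_F = col(Z_F) via P = Z (Z^T Z)^{-1} Z^T.
P_F = Z @ np.linalg.inv(Z.T @ Z) @ Z.T

y = np.array([1.0, 3.0, 2.0, 6.0, 5.0, 5.0])
# P_F y replaces each y_i by its group average: [2, 2, 4, 4, 5, 5].
averaged = P_F @ y
```

The same matrix P_F is symmetric and idempotent, as any orthogonal projection must be.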
A factor F is said to be balanced if there is a common number m of observations at each of the k levels (whereby n = mk). In this case the orthogonal projection P_F onto L_F is

  P_F = (1/m) Z_F Z_F^T.

This type of identity is crucial in the following and is the reason why we focus on balanced factors.

2 One-way ANOVA

Consider the model

  Y_ij = ξ + U_i + ε_ij,  i = 1,...,k, j = 1,...,n_i,

where ξ ∈ R, U_i ~ N(0, τ²), ε_ij ~ N(0, σ²), τ², σ² ≥ 0, and the variables U_i and ε_ij are independent for i = 1,...,k and j = 1,...,n_i. Let I consist of the indices ij for i = 1,...,k and j = 1,...,n_i, and define the factor F by F(ij) = i. Then, stacking the variables on top of each other, we can write the model in vector form as

  Y = 1_n ξ + Z_F U + ε,

where n is the total number of observations. As noted in the previous section, Z_F is the design matrix corresponding to F: the (ij, q)th entry of Z_F is 1 if Y_ij belongs to the qth group and zero otherwise.

2.1 Orthogonal decomposition

We now obtain an orthogonal decomposition of R^n:

  R^n = V_0 ⊕ V_F ⊕ V_I,

where V_0 = L_0 = span{1_n}, V_F = L_F ⊖ V_0 and V_I = R^n ⊖ L_F. Here ⊕ denotes the sum of orthogonal subspaces, L_1 ⊕ L_2 = {x + y | x ∈ L_1, y ∈ L_2} for orthogonal subspaces L_1 and L_2, and ⊖ denotes the orthogonal complement, L_2 ⊖ L_1 = {x ∈ L_2 | x^T y = 0 for all y ∈ L_1}
for L_1 ⊆ L_2. The dimensions of V_0, V_F and V_I are 1, k − 1 and n − k, and the orthogonal projections onto V_0, V_F and V_I are Q_0 = P_0, Q_F = P_F − P_0 and Q_I = I − P_F (see exercise 1). Using the orthogonal projections we also obtain an orthogonal decomposition of the data vector,

  Y = Q_0 Y + Q_F Y + Q_I Y,

into components falling in V_0, V_F and V_I.

Assuming now the balanced case n_i = m, the covariance matrix is decomposed as

  Cov Y = Cov(Z_F U) + Cov ε = mτ² P_F + σ² I = λ P_F + σ² Q_I,

where λ = mτ² + σ². Here we used Cov(Z_F U) = τ² Z_F Z_F^T = mτ² P_F and I = P_F + Q_I. Note that there is a one-to-one correspondence between the pairs (λ, σ²) and (τ², σ²).

The components Q_0 Y, Q_F Y and Q_I Y are independent since their covariances are zero. For example,

  Cov(Q_0 Y, Q_F Y) = Q_0 Σ Q_F = Q_0 (λ P_F + σ² Q_I) Q_F,

which is zero since P_F = Q_0 + Q_F and all the products Q_0 Q_F = Q_0 Q_I = Q_F Q_I = 0. Since

  Cov Y = λ P_F + σ² Q_I = Cov(P_F Y) + Cov(Q_I Y),

we also consider in the next section the coarser decomposition Y = P_F Y + Q_I Y of Y into independent components P_F Y and Q_I Y falling in L_F and V_I.

2.2 Estimation using factorization of the likelihood

Note that P_F 1_n = Q_0 1_n = 1_n and Q_I 1_n = 0. Moreover Q_I P_F = 0. Hence

  (P_F Y, Q_I Y) ~ N( (1_n ξ, 0_n), blockdiag(λ P_F, σ² Q_I) ).

We can thus base maximum likelihood estimation of (ξ, λ) on P_F Y and maximum likelihood estimation of σ² on Q_I Y.
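The decomposition Y = Q_0 Y + Q_F Y + Q_I Y can be checked numerically. This is a sketch with hypothetical sizes k = 3, m = 4 and simulated data; it verifies that the three components sum to Y and that their squared norms add up (Pythagoras):

```python
import numpy as np

# Hypothetical balanced one-way layout: k = 3 groups, m = 4 observations each.
k, m = 3, 4
n = k * m
Z = np.kron(np.eye(k), np.ones((m, 1)))   # design matrix Z_F (group blocks)
P_F = Z @ Z.T / m                         # projection onto L_F (balanced case)
P_0 = np.ones((n, n)) / n                 # projection onto span{1_n}

Q_0, Q_F, Q_I = P_0, P_F - P_0, np.eye(n) - P_F

rng = np.random.default_rng(0)
y = rng.normal(size=n)

# The components sum to y, and the squared norms add up (Pythagoras).
parts = [Q @ y for Q in (Q_0, Q_F, Q_I)]
assert np.allclose(sum(parts), y)
assert np.isclose(sum(np.sum(p ** 2) for p in parts), np.sum(y ** 2))
```

The pairwise products of the Q matrices vanish, which is exactly the orthogonality used in the covariance calculation above.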
More precisely,

  |Σ|^{−1/2} exp(−½ (Y − 1_n ξ)^T Σ^{−1} (Y − 1_n ξ))
    = λ^{−k/2} exp(−(1/2λ) ‖P_F Y − 1_n ξ‖²) · (σ²)^{−k(m−1)/2} exp(−(1/2σ²) ‖Q_I Y‖²),

where we used Σ^{−1} = σ^{−2} Q_I + λ^{−1} P_F and |Σ| = λ^k (σ²)^{mk−k} (see exercises 2, 3 and 5). The two factors in the above likelihood are in fact generalized densities of the degenerate normal vectors P_F Y and Q_I Y (these vectors are n-dimensional but their distributions are concentrated on lower-dimensional subspaces of dimension k and n − k).

Consider e.g. the factor λ^{−k/2} exp(−(1/2λ) ‖P_F Y − 1_n ξ‖²) involving the parameters λ and ξ. We can maximize this with respect to λ and ξ in exactly the same way as when we previously considered the likelihood of a linear normal model N_n(μ, σ² I). Since L_0 = span{1_n} we thus obtain 1_n ξ̂ = P_0 P_F Y = P_0 Y and

  λ̂ = ‖P_F Y − P_0 Y‖²/k = SSB/k.

Proceeding in the same way for the second factor (where there is no mean parameter), we obtain

  σ̂² = ‖Q_I Y‖²/(k(m−1)) = SSE/(k(m−1)).

Note that Q_F Y ~ N_n(0, λ Q_F). By exercise 6,

  ‖P_F Y − P_0 Y‖² = ‖Q_F Y‖² ~ λ χ²(k−1),

which has mean λ(k−1). Thus λ̂ is biased. An unbiased (REML) estimate is given by

  λ̃ = ‖P_F Y − P_0 Y‖²/(k−1) = SSB/(k−1).

3 Two-way analysis of variance

Consider now a model with two factors T (treatment) and P (plot) with numbers of levels d_T and d_P. Moreover let P×T be the cross factor (with d_P d_T levels, one for each combination of levels of P and T). More specifically, I consists of indices ptr and the factor mappings are P(ptr) = p, T(ptr) = t and P×T(ptr) = pt. Assume P×T is balanced with n_{P×T} = m observations for each level. Then P and T are balanced too, with n_P = m d_T and n_T = m d_P observations for each level.
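The one-way estimates can be sketched directly from group means, since SSB = ‖P_F Y − P_0 Y‖² and SSE = ‖Q_I Y‖² reduce to the usual between- and within-group sums of squares. The sizes and data below are hypothetical:

```python
import numpy as np

# ML and REML estimates in the balanced one-way model (hypothetical data:
# k groups, m observations per group).
k, m = 4, 5
rng = np.random.default_rng(1)
y = rng.normal(size=k * m)
groups = np.repeat(np.arange(k), m)

ybar = y.mean()
group_means = np.array([y[groups == q].mean() for q in range(k)])

SSB = m * np.sum((group_means - ybar) ** 2)   # = ||P_F Y - P_0 Y||^2
SSE = np.sum((y - group_means[groups]) ** 2)  # = ||Q_I Y||^2

lam_hat = SSB / k                # ML estimate of lambda (biased)
lam_reml = SSB / (k - 1)         # unbiased REML estimate
sigma2_hat = SSE / (k * (m - 1)) # estimate of sigma^2
```

Because the decomposition is orthogonal, SSB + SSE equals the total sum of squares ‖Y − P_0 Y‖².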
A two-way ANOVA model with fixed T effects and random P and P×T effects is

  Y_ptr = ξ + β_t + U_p + U_pt + ε_ptr,  p = 1,...,d_P, t = 1,...,d_T, r = 1,...,m,

where the U_p ~ N(0, σ²_P), U_pt ~ N(0, σ²_{P×T}) and ε_ptr ~ N(0, σ²) are independent. With a convenient abuse of terminology we refer to P and P×T as random factors. The random effects U_p and U_pt can e.g. serve to model soil variation between plots in a field in the case of an agricultural experiment. In vector form the model is

  Y = μ + Z_P U_P + Z_{P×T} U_{P×T} + ε,

where μ ∈ L_T. The vectors Y, U_P and U_{P×T} are again obtained by stacking, e.g. U_P = (U_1, ..., U_{d_P})^T.

3.1 Orthogonal decomposition

Similarly to the one-way ANOVA we define

  V_0 = L_0,  V_T = L_T ⊖ V_0,  V_P = L_P ⊖ V_0.

Since P×T is balanced it follows that P_T P_P = P_P P_T = P_0 (exercise 7), which implies Q_T Q_P = 0, where Q_T = P_T − P_0 and Q_P = P_P − P_0. Hence V_T and V_P are orthogonal and we obtain an orthogonal decomposition of the sum of L_P and L_T:

  L_P + L_T = {v + w | v ∈ L_P, w ∈ L_T} = V_0 ⊕ V_P ⊕ V_T.

We further define

  V_{P×T} = L_{P×T} ⊖ (L_P + L_T),  V_I = R^n ⊖ L_{P×T},

and obtain the orthogonal decomposition

  R^n = V_0 ⊕ V_P ⊕ V_T ⊕ V_{P×T} ⊕ V_I.

The dimensions of the V spaces are f_0 = 1, f_P = d_P − 1, f_T = d_T − 1, f_{P×T} = d_P d_T − d_P − d_T + 1 = (d_P − 1)(d_T − 1) and f_I = n − d_P d_T. The orthogonal projections onto the V spaces are Q_0 = P_0, Q_P = P_P − Q_0, Q_T = P_T − Q_0, Q_{P×T} = P_{P×T} − Q_P − Q_T − Q_0 and Q_I = I − P_{P×T}.
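The two-way projections and their dimensions can be verified numerically; the trace of an orthogonal projection equals the dimension of its range. This sketch uses hypothetical sizes d_P = 3, d_T = 4, m = 2 and builds the design matrices as Kronecker products (index order p, t, r):

```python
import numpy as np

# Hypothetical balanced two-way layout.
dP, dT, m = 3, 4, 2
n = dP * dT * m   # = 24

# Design matrices via Kronecker products (p slowest, then t, then r).
ZP = np.kron(np.eye(dP), np.ones((dT * m, 1)))
ZT = np.kron(np.ones((dP, 1)), np.kron(np.eye(dT), np.ones((m, 1))))
ZPT = np.kron(np.eye(dP * dT), np.ones((m, 1)))

# Balanced projections: divide by the number of observations per level.
PP = ZP @ ZP.T / (dT * m)
PT = ZT @ ZT.T / (dP * m)
PPT = ZPT @ ZPT.T / m
P0 = np.ones((n, n)) / n

QP, QT = PP - P0, PT - P0
QPT = PPT - QP - QT - P0
QI = np.eye(n) - PPT

# trace(Q) = dim of the corresponding V space:
dims = [round(np.trace(Q)) for Q in (QP, QT, QPT, QI)]
# f_P = 2, f_T = 3, f_{PxT} = 6, f_I = 12
```

The identity P_T P_P = P_0 from exercise 7 also holds for this construction.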
In line with the orthogonal decomposition of R^n we also obtain two decompositions of Y into orthogonal components:

  Y = Q_0 Y + Q_P Y + Q_T Y + Q_{P×T} Y + Q_I Y = Q̃_P Y + Q̃_{P×T} Y + Q̃_I Y,

where Q̃_P = Q_0 + Q_P = P_P, Q̃_{P×T} = Q_{P×T} + Q_T and Q̃_I = Q_I. The second decomposition corresponds to a coarser decomposition R^n = Ṽ_P ⊕ Ṽ_{P×T} ⊕ Ṽ_I into orthogonal subspaces Ṽ_P = V_0 ⊕ V_P = L_P, Ṽ_{P×T} = V_T ⊕ V_{P×T} and Ṽ_I = V_I associated with the random factors P, P×T and I. The coarser decomposition of R^n corresponds to a decomposition of the covariance matrix using the Q̃ projection matrices:

  Cov Y = σ²_P n_P P_P + σ²_{P×T} n_{P×T} P_{P×T} + σ² I = λ_P Q̃_P + λ_{P×T} Q̃_{P×T} + λ_I Q̃_I,

where

  λ_P = σ² + n_{P×T} σ²_{P×T} + n_P σ²_P   (1)
  λ_{P×T} = σ² + n_{P×T} σ²_{P×T}          (2)
  λ_I = σ².

The mean vector μ ∈ L_T = V_0 ⊕ V_T is decomposed as

  μ = Q̃_P μ + Q̃_{P×T} μ + Q̃_I μ = Q_0 μ + Q_T μ + 0 = μ_0 + μ_T,

where μ_0 ∈ L_0 and μ_T ∈ V_T. Note the one-to-one correspondences between the σ²'s and the λ's and between μ and (μ_0, μ_T).

3.2 Relation to the sum-to-zero constraint

In the model for the mean vector μ, the parameters ξ and β_1, ..., β_{d_T} are not identifiable. This is because

  μ_ptr = ξ + β_t = (ξ + k) + (β_t − k) = ξ′ + β′_t.
Hence if we add k to ξ we obtain the same mean vector by just subtracting k from the β_t's. One way to enforce identifiability is to require that one β_t is zero (i.e. choosing a reference group). Another option is to introduce the sum-to-zero constraint

  Σ_{t=1}^{d_T} β_t = 0.

Note that vectors v ∈ V_T are of the form v_ptr = β_t, since they lie in L_T, and they further satisfy the sum-to-zero constraint

  1_n^T v = d_P m Σ_{t=1}^{d_T} β_t = 0

due to the orthogonality of V_0 and V_T. In other words, enforcing the sum-to-zero constraint on the β_t parameters corresponds to the decomposition of μ into a constant vector ξ 1_n falling in V_0 and a vector v = (β_t)_ptr ∈ V_T.

3.3 Estimation using factorization of the likelihood

As for the one-way ANOVA, Y is decomposed into independent normal vectors

  Q̃_P Y ~ N(μ_0, λ_P Q̃_P),  Q̃_{P×T} Y ~ N(μ_T, λ_{P×T} Q̃_{P×T}),  Q̃_I Y ~ N(0, λ_I Q̃_I).

Accordingly, the density of Y factorizes into a product of (generalized) densities for Q̃_P Y, Q̃_{P×T} Y and Q̃_I Y, similarly to the case of the one-way ANOVA. For instance, the density of Q̃_P Y is proportional to

  λ_P^{−d_P/2} exp(−(1/2λ_P) ‖Q̃_P Y − μ_0‖²).

Thus, in analogy with estimation in the usual linear model Y ~ N_n(μ, σ² I), we obtain μ̂_0 = P_0 Q̃_P Y = P_0 Y = 1_n Ȳ and

  λ̂_P = ‖Q̃_P Y − P_0 Y‖²/d_P = ‖Q_P Y‖²/d_P.

Similarly,

  μ̂_T = Q_T Q̃_{P×T} Y = Q_T Y,  μ̂ = μ̂_0 + μ̂_T = P_T Y,
  λ̂_{P×T} = ‖Q̃_{P×T} Y − Q_T Y‖²/(d_P d_T − d_P) = ‖Q_{P×T} Y‖²/(d_P d_T − d_P),

and λ̂_I = σ̂² = ‖Q_I Y‖²/(n − d_P d_T).
Since V_P and V_{P×T} have dimensions f_P = d_P − 1 and f_{P×T} = (d_P − 1)(d_T − 1), we often use these denominators (cf. exercise 6) instead of d_P and d_P d_T − d_P for λ_P and λ_{P×T}, thereby obtaining the unbiased estimates

  λ̃_P = ‖Q_P Y‖²/f_P,  λ̃_{P×T} = ‖Q_{P×T} Y‖²/f_{P×T},

of λ_P and λ_{P×T}. Note that the MLE of μ is in fact identical to the MLE in the linear model with μ ∈ L_T and no random effects.

3.4 Estimation of the original variance components

Estimates of the original variances σ²_P and σ²_{P×T} can be obtained by simply solving (1) and (2) with respect to these parameters. That is,

  σ̃²_P = (λ̃_P − λ̃_{P×T})/n_P,  σ̃²_{P×T} = (λ̃_{P×T} − σ̂²)/n_{P×T}.

Note that there is no guarantee that these estimates are non-negative. In fact there could be several reasons for obtaining negative variance estimates:

- if e.g. the true σ²_P is zero or close to zero, it is not unlikely to encounter λ̃_P < λ̃_{P×T} due to sampling variation;
- observations within a group could be negatively correlated, which could be reflected by a negative variance estimate (recall the definition of the intra-class correlation);
- the data deviate in some way (e.g. outliers) from the assumed model.

In general, if a negative variance estimate is obtained, it is recommended (as always) to check carefully whether the model assumptions seem valid. If the data do not seem to contradict the model, one may consider refitting the model without the random factor for which the negative variance estimate was obtained. If negative correlation within groups of the data is plausible, one may look into an alternative to the assumed linear mixed model.

4 Strata

A crucial common point for the one- and two-way ANOVA models in the previous sections is that the factorization of the likelihood is based on an orthogonal decomposition induced by the random factors only. This is due
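Solving (1) and (2) back for the original variance components is simple arithmetic. In this sketch the λ estimates are hypothetical placeholder values, not fitted from data:

```python
# Recovering the original variance components by solving (1) and (2).
# n_P = m*d_T and n_PT = m; the lambda values below are hypothetical.
dP, dT, m = 4, 3, 2
nP, nPT = m * dT, m

lam_P, lam_PT, sigma2 = 10.0, 4.0, 1.0   # placeholder stratum variances

sigma2_PT = (lam_PT - sigma2) / nPT      # (4 - 1)/2  = 1.5
sigma2_P = (lam_P - lam_PT) / nP         # (10 - 4)/6 = 1.0
```

With λ̃_P < λ̃_{P×T} the first difference would be negative, which is exactly the negative-variance situation discussed above.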
[Figure 1: Structure diagram for the two-way ANOVA with random effects, with nodes I, P×T, P, T and 0.]

to the decomposition of the covariance matrix into terms given by scaled projection matrices for the random factors. The fixed effects are decomposed into components according to the V spaces for the fixed factors. These components are then assigned to strata corresponding to the Ṽ spaces for the random factors.

We say that a factor F is finer than G (or G coarser than F) if the levels of G can be obtained by merging levels of F. We then write G ≼ F. The rule for allocating fixed effects to strata is that F belongs to the B stratum (where B is a random factor) if B is the coarsest random factor which is finer than F. This of course assumes that the rule makes sense, i.e. that we work with an experimental design where it gives a unique allocation of each fixed factor to one random factor. It is for instance required that I is a random factor. Also, for fixed F and random B_1, B_2, we cannot have both F ≼ B_1 and F ≼ B_2 unless either B_1 ≼ B_2 or B_2 ≼ B_1.

To get an overview of the allocation to strata it is useful to draw a structure diagram, as shown for the two-way ANOVA in Figure 1. In the structure diagram there is an arrow from F to G if G is coarser than F and there is no intermediate factor which is coarser than F and finer than G. In Figure 1
there are three strata (black, green and red) corresponding to the random factors I, P×T and P. Note that here we cannot have both P and T random unless 0 is random too (if 0 were fixed in this case, there would not be a unique allocation of 0 to a random-factor stratum). We also want to respect the hierarchical principle for the mean structure, so P×T cannot be fixed if P is random. A more general discussion of allocation to strata is given in Section 7.

5 Hypothesis tests and confidence intervals

We here exemplify how hypothesis tests are conducted in balanced ANOVAs by considering one- and two-way ANOVAs.

5.1 One-way ANOVA

For the one-way ANOVA with random effects consider the F statistic

  F = [SSB/(k−1)] / [SSE/(k(m−1))] = λ̃/σ̂² ~ [(σ² + mτ²)/σ²] F(k−1, k(m−1)) = (1 + mγ) F(k−1, k(m−1)),

where γ = τ²/σ² is the signal-to-noise ratio. Thus with q_L and q_U e.g. the 2.5% and 97.5% quantiles of F(k−1, k(m−1)), we have

  P(q_L ≤ F/(1 + mγ) ≤ q_U) = 95%
  ⇔ P((F/q_U − 1)/m ≤ γ ≤ (F/q_L − 1)/m) = 95%,

so that [(F/q_U − 1)/m, (F/q_L − 1)/m] is a 95% confidence interval for γ; see also Remark 5.10 in [1].

One hypothesis is of particular interest: H_0: τ² = 0, i.e. there is no variation between groups. A simple F test for this is based on the observation

  τ² = 0 ⇔ γ = 0 ⇔ σ² = λ.
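Turning the F statistic into a confidence interval for γ is a small calculation. In the sketch below k, m, SSB, SSE and the F quantiles are all hypothetical placeholders; in practice q_L and q_U would come from the F(k−1, k(m−1)) distribution (e.g. via scipy.stats.f.ppf):

```python
# Hypothetical one-way ANOVA summary statistics.
k, m = 6, 4
SSB, SSE = 30.0, 40.0

# Placeholder 2.5%/97.5% quantiles of F(5, 18) (approximate values).
qL, qU = 0.16, 3.38

F = (SSB / (k - 1)) / (SSE / (k * (m - 1)))   # = 6 / (40/18) = 2.7
lower = (F / qU - 1) / m
upper = (F / qL - 1) / m
# [lower, upper] is the 95% confidence interval for gamma = tau^2/sigma^2.
```

A lower endpoint below zero is possible and simply means the data are consistent with τ² = 0.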
Thus large values of the F statistic above are critical for H_0, and under H_0, F has an F(k−1, k(m−1)) distribution. Note that the F statistic coincides with the F test for no group effects in a one-way ANOVA with fixed group effects (which is equivalent to the likelihood ratio test for no group effects in the fixed-effects one-way ANOVA).

5.2 Tests for fixed effects in the two-way ANOVA

For the two-way ANOVA considered in Section 3 the interesting hypothesis regarding the fixed effects is

  H_0: β_1 = β_2 = ··· = β_{d_T},

i.e. no fixed group effects. This is equivalent to μ_T = 0 (referring to the notation of Section 3.1). Thus for constructing the likelihood ratio test we only need to consider the factor corresponding to the term Q̃_{P×T} Y ~ N(μ_T, λ_{P×T} Q̃_{P×T}), which has density proportional to

  λ_{P×T}^{−(d_P d_T − d_P)/2} exp(−(1/2λ_{P×T}) ‖Q̃_{P×T} Y − μ_T‖²).

Thus the situation is equivalent to testing μ = 0 for a linear normal model N_{d_P d_T − d_P}(μ, λ_{P×T} I). We can therefore proceed as for a usual linear normal model (without random effects) and obtain the F test

  F = [‖Q_T Y‖²/(d_T − 1)] / λ̃_{P×T} ~ F(d_T − 1, (d_P − 1)(d_T − 1)).

5.3 Tests for variance components in the two-way ANOVA

Recall that λ_I = σ², λ_{P×T} = σ² + n_{P×T} σ²_{P×T} and λ_P = σ² + n_{P×T} σ²_{P×T} + n_P σ²_P. Hence e.g.

  σ²_{P×T} = 0 ⇔ λ_I = λ_{P×T}.

Thus a natural statistic (but not a likelihood-ratio statistic) for testing σ²_{P×T} = 0 is

  F = λ̃_{P×T}/σ̂²,

which has an F((d_P − 1)(d_T − 1), n − d_P d_T) distribution if σ²_{P×T} = 0. Large values of F are critical for σ²_{P×T} = 0. Note that λ̃_{P×T} = ‖Q_{P×T} Y‖²/((d_P − 1)(d_T − 1)), so F is identical to the F statistic for testing the hypothesis of no (fixed) effect of the factor P×T in a linear normal model without random effects.
5.4 Confidence intervals for parameters in the two-way ANOVA

Regarding the mean vector μ = 1_n ξ + Z_T β, where β = (β_1, ..., β_{d_T}), we can consider confidence intervals for components ξ + β_i of μ or for contrasts δ_ij = β_i − β_j corresponding to differences between group means. It is easy to derive confidence intervals for the contrasts: employing the orthogonal decomposition μ = μ_0 + μ_T, the contrasts only involve μ_T in the P×T stratum. E.g. if c is a vector with c_1i1 = 1, c_1j1 = −1 and zeros elsewhere, then δ_ij = β_i − β_j = c^T μ = c^T μ_T. Thus δ_ij can be estimated as

  δ̂_ij = c^T μ̂_T = c^T Q_T Q̃_{P×T} Y ~ N(β_i − β_j, λ_{P×T} c^T Q_T c),

and we can follow the usual route to constructing confidence intervals based on the t-distribution, combining the above normal distribution for δ̂_ij with the scaled χ² distribution of λ̃_{P×T} with (d_P − 1)(d_T − 1) degrees of freedom. Note that

  c^T Q_T Q̃_{P×T} Y = c^T Q_T Y = ȳ_i − ȳ_j

and

  c^T Q_T c = c^T P_T c − c^T P_0 c = c^T P_T c = (1, −1)(1/n_T, −1/n_T)^T = 2/n_T.

Hence the t statistic becomes

  (ȳ_i − ȳ_j) / sqrt(2 λ̃_{P×T}/n_T),

which has a t((d_P − 1)(d_T − 1)) distribution under the null hypothesis β_i = β_j.

You can also get the above results by simply observing that

  ȳ_i − ȳ_j = β_i − β_j + (m/n_T) Σ_p (U_pi − U_pj) + (1/n_T) Σ_{p,r} (ε_pir − ε_pjr)
            ~ N(β_i − β_j, (2/n_T)(m σ²_{P×T} + σ²)).

Note that here m = n_{P×T}, so m σ²_{P×T} + σ² = λ_{P×T}.

Confidence intervals for the λ variance parameters (including σ² = λ_I) are straightforward due to the exact scaled χ² distributions of their estimates. Confidence intervals for the original variance parameters (other than σ²) are less straightforward, since their estimates are distributed as differences of scaled χ² variables.
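The contrast t statistic amounts to comparing treatment-level means. The following sketch uses hypothetical data and a hypothetical placeholder value for the REML estimate λ̃_{P×T}:

```python
import numpy as np

# t statistic for the contrast delta_ij = beta_i - beta_j (hypothetical data).
dP, dT, m = 3, 4, 2
nT = dP * m                        # n_T = m * d_P observations per T level
rng = np.random.default_rng(3)
y = rng.normal(size=(dP, dT, m))   # y[p, t, r]

lam_PT = 1.3   # placeholder for ||Q_{PxT} Y||^2 / ((dP-1)(dT-1))
i, j = 0, 1
diff = y[:, i, :].mean() - y[:, j, :].mean()   # ybar_i - ybar_j
t_stat = diff / np.sqrt(2 * lam_PT / nT)
# compare |t_stat| with quantiles of t((dP-1)(dT-1)) = t(6)
```

Note that the plot effects U_p cancel in the difference ȳ_i − ȳ_j, which is why only λ_{P×T} enters the standard error.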
6 Orthogonal decomposition for a K-way ANOVA

In this section we first construct an orthogonal decomposition for a balanced three-way ANOVA and then generalize the approach to a K-way balanced ANOVA for arbitrary K ≥ 1.

We know by now that for a two-way ANOVA corresponding to factors F_1 and F_2 with F_1 × F_2 balanced, there exists a decomposition of R^n into orthogonal subspaces V_I, V_{F_1×F_2}, V_{F_1}, V_{F_2} and V_0. Suppose we introduce a third factor F_3 so that F_1 × F_2 × F_3 is balanced too. Then {V_I, V_{F_1×F_3}, V_{F_1}, V_{F_3}, V_0} and {V_I, V_{F_2×F_3}, V_{F_2}, V_{F_3}, V_0} are also sets of orthogonal subspaces. By analogy with a two-way ANOVA with factors F = F_1 × F_2 and G = F_3, it also follows that V_{F_1×F_2} and V_{F_3} are orthogonal, and similarly for the pairs V_{F_1×F_3}, V_{F_2} and V_{F_2×F_3}, V_{F_1}. It is also easy to see that

  P_{F_1×F_2} P_{F_2×F_3} = P_{F_2}.   (3)

This implies that V_{F_1×F_2} and V_{F_2×F_3} are orthogonal. To see this, note that the corresponding projections are Q_{F_1×F_2} = P_{F_1×F_2} − P_{F_2} − Q_{F_1} and Q_{F_2×F_3} = P_{F_2×F_3} − P_{F_2} − Q_{F_3}. We now check that Q_{F_1×F_2} Q_{F_2×F_3} = 0, from which the required orthogonality follows:

  Q_{F_1×F_2} Q_{F_2×F_3} = (P_{F_1×F_2} − P_{F_2} − Q_{F_1})(P_{F_2×F_3} − P_{F_2} − Q_{F_3})
    = P_{F_2} − P_{F_2} − P_{F_1×F_2} Q_{F_3} − P_{F_2} + P_{F_2} + 0 − Q_{F_1} P_{F_2×F_3} + 0 + 0 = 0.

Here P_{F_1×F_2} Q_{F_3} = 0 in analogy with a two-way ANOVA with factors F = F_1 × F_2 and G = F_3, and Q_{F_1} P_{F_2×F_3} = 0 by the same argument. Thus V_{F_1×F_2}, V_{F_1×F_3} and V_{F_2×F_3} are orthogonal too. Defining

  V_{F_1×F_2×F_3} = L_{F_1×F_2×F_3} ⊖ (V_0 ⊕ V_{F_1} ⊕ V_{F_2} ⊕ V_{F_3} ⊕ V_{F_1×F_2} ⊕ V_{F_1×F_3} ⊕ V_{F_2×F_3}),
  V_I = R^n ⊖ L_{F_1×F_2×F_3},

we arrive at the orthogonal decomposition

  R^n = V_I ⊕ ⊕_{A ⊆ {1,2,3}} V_{×_{l∈A} F_l},

letting V_{×_{l∈∅} F_l} = V_0. The approach for the three-way ANOVA can be generalized to obtain the following main result:
Theorem 1. Suppose F_1 × ··· × F_K is balanced, K ≥ 1. Define, recursively, for A ⊆ {1,...,K},

  V_{×_{l∈A} F_l} = L_{×_{l∈A} F_l} ⊖ ⊕_{B ⊊ A} V_{×_{l∈B} F_l},

with V_{×_{l∈∅} F_l} = V_0 = L_0, and let V_I = R^n ⊖ L_{F_1×···×F_K}. Then we have the orthogonal decomposition

  R^n = V_I ⊕ ⊕_{A ⊆ {1,...,K}} V_{×_{l∈A} F_l}.

Proof. We need to show that (*) V_{×_{l∈A} F_l} and V_{×_{l∈B} F_l} are orthogonal whenever A, B ⊆ {1,...,K}, A ≠ B (this is already shown for K = 1, 2, 3). If A ∩ B = ∅ then (*) follows by analogy with a two-way ANOVA with factors F = ×_{l∈A} F_l and G = ×_{l∈B} F_l. If B ⊊ A we define F = ×_{l∈B} F_l and G = ×_{l∈A\B} F_l. Then V_{×_{l∈A} F_l} = V_{F×G} and the result again follows by analogy with a two-way ANOVA. The case A ⊊ B is handled similarly. Finally, if C = A ∩ B is non-empty and neither A ⊆ B nor B ⊆ A, let F = ×_{l∈A\C} F_l, G = ×_{l∈B\C} F_l and H = ×_{l∈C} F_l. Then (*) holds by analogy with a three-way ANOVA.

7 Decomposition into strata

In this section we discuss a general decomposition into strata corresponding to factors with random effects. Let D be a set of factors and let B ⊆ D denote the set of factors with random effects. We assume that:

- the factors in D are distinct, in the sense that we cannot have both F ≼ F′ and F′ ≼ F for different F, F′ ∈ D (otherwise two factors in D would induce the same grouping of the observations);
- there is an orthogonal decomposition R^n = ⊕_{F∈D} V_F of R^n into a sum of subspaces V_F, F ∈ D, with associated orthogonal projection matrices Q_F;
- the covariance matrix of the data vector is decomposed as

  Σ = Σ_{B∈B} σ²_B n_B P_B,
where for B ∈ B,

  P_B = Σ_{F∈D: F≼B} Q_F.   (4)

We want projections Q̃_B, B ∈ B, so that

  P_B = Σ_{B′∈B: B′≼B} Q̃_{B′},   (5)

where for each B ∈ B, Q̃_B is the orthogonal projection onto a subspace Ṽ_B, where R^n = ⊕_{B∈B} Ṽ_B and Ṽ_B is a sum of V_F subspaces. We want this because then we obtain the decomposition

  Σ = Σ_{B∈B} σ²_B n_B P_B = Σ_{B∈B} λ_B Q̃_B,  where λ_B = Σ_{B′∈B: B≼B′} n_{B′} σ²_{B′},

and hence a decomposition of Y into independent terms Q̃_B Y, B ∈ B. The following theorem tells us when we may get what we want.

Theorem 2. A necessary and sufficient condition for (5) is that for all F ∈ D there exists a B ∈ B such that F ≼ B and B ≼ B′ for all other B′ ∈ B with F ≼ B′. If this condition holds we can define

  Ṽ_B = ⊕_{F∈D: B(F)=B} V_F  and  Q̃_B = Σ_{F∈D: B(F)=B} Q_F.

Remark 1. Note that this condition implies that I has to be in B. This is not a serious restriction since we will always include measurement error in our models. The condition also implies that the B mentioned in it is unique. In the proof below we use the notation B(F) for the unique B corresponding to F, F ∈ D.

Proof. We first show sufficiency. Each factor F ∈ D with F ≼ B on the right hand side of (4) satisfies B(F) = B′ for precisely one B′ ∈ B with B′ ≼ B. Hence

  P_B = Σ_{B′∈B: B′≼B} Σ_{F∈D: B(F)=B′} Q_F.

Thus we can define, for B ∈ B, a subspace and associated orthogonal projection

  Ṽ_B = ⊕_{F∈D: B(F)=B} V_F  and  Q̃_B = Σ_{F∈D: B(F)=B} Q_F.
Each F ∈ D belongs to precisely one Ṽ_B, so R^n = ⊕_{B∈B} Ṽ_B.

To show that the condition is necessary, note first that to obtain (5), for each F ∈ D we must have V_F ⊆ Ṽ_B for some B ∈ B; moreover, by equating (4) and (5), we need F ≼ B. Suppose next that B_i, i = 1,...,m, m ≥ 1, are the elements in B that satisfy F ≼ B_i. If m ≥ 2, assume by contradiction that for each B_i there is a B_j such that B_i ⋠ B_j. Assume without loss of generality that V_F ⊆ Ṽ_{B_1} and that B_1 ⋠ B_2. Since F ≼ B_2, we would then by (4) need to include Q̃_{B_1} in the sum (5) for P_{B_2}, but this contradicts the restriction on the sum in (5) when B_1 ⋠ B_2.

8 Concluding remarks

This note essentially gives a simplified exposition of results presented in [3]. A general exposition can also be found in [2]. We have here considered the use of orthogonal decompositions only in the context of balanced ANOVAs. However, the same principles apply to other types of linear mixed models that allow relevant orthogonal decompositions of the sample space. One example is a random coefficient model for the well-known orthodontic data set, consisting of data for 27 children each with 4 measurements. Here one can e.g. observe that the MLEs of the mean parameters are the same in the models with and without random child effects.

References

[1] Henrik Madsen and Poul Thyregod. Introduction to General and Generalized Linear Models. Chapman & Hall/CRC Texts in Statistical Science.

[2] Jesper Møller. Centrale statistiske modeller og likelihood baserede metoder. Department of Mathematical Sciences, Aarhus University.

[3] Tue Tjur. Analysis of variance models in orthogonal designs. International Statistical Review, 52.
A Orthogonal projections

Suppose L is a subspace of R^n, n ≥ 1. Let L^⊥ = {v ∈ R^n | v·w = 0 for all w ∈ L} denote its orthogonal complement.

Orthogonal decomposition: each x ∈ R^n has a unique decomposition

  x = u + v,

where u ∈ L and v ∈ L^⊥.

Orthogonal projection: u and v above are the orthogonal projections p_L(x) and p_{L^⊥}(x) of x onto L and L^⊥, respectively. Due to the orthogonality of u and v, we obtain Pythagoras' theorem:

  ‖x‖² = ‖u‖² + ‖v‖².

A few facts regarding orthogonal projections:

(a) the orthogonal projection p_L : R^n → L onto L is a linear mapping. It is thus given by a unique matrix transformation p_L(x) = P x, where P is an n × n matrix.

(b) the projection matrix P is symmetric (P^T = P) and idempotent (P² = P).

(c) conversely, if a matrix Q is symmetric and idempotent and L = col Q, then Q is the matrix of the orthogonal projection onto L.

(d) if L = col X and X has full rank, then P = X(X^T X)^{−1} X^T.

(e) if P is the matrix of the orthogonal projection onto L, then I − P is the matrix of the orthogonal projection onto L^⊥.

(f) if L_1 and L_2 are subspaces with L_1 ⊆ L_2 and associated orthogonal projections P_1 and P_2, then the orthogonal projection onto L_2 ⊖ L_1 is P_2 − P_1.

Example: the orthogonal projection onto the subspace spanned by a single vector v is

  x ↦ (v^T x / ‖v‖²) v.

Example: the orthogonal projection onto the subspace spanned by orthogonal vectors v_1, ..., v_p is

  x ↦ Σ_{i=1}^p (v_i^T x / ‖v_i‖²) v_i.
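Facts (b), (d) and (e) can be checked numerically; this is a sketch with a hypothetical full-rank matrix X:

```python
import numpy as np

# Numerical check of facts (b), (d) and (e) for a hypothetical full-rank X.
rng = np.random.default_rng(2)
X = rng.normal(size=(7, 3))
P = X @ np.linalg.inv(X.T @ X) @ X.T         # fact (d): projection onto col(X)

assert np.allclose(P, P.T)                   # fact (b): symmetric
assert np.allclose(P @ P, P)                 # fact (b): idempotent
assert np.allclose((np.eye(7) - P) @ X, 0)   # fact (e): I - P annihilates col(X)
```

The same checks apply verbatim to any of the Q projections used in the note, since each is symmetric and idempotent.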
B Exercises

1. Let L_1 ⊆ L_2 with orthogonal projections P_1 and P_2. Show that P_2 − P_1 is the orthogonal projection onto L_2 ⊖ L_1.

2. Show that an orthogonal projection only has eigenvalues 0 or 1.

3. For a symmetric matrix A, show that the determinant |A| is the product of A's eigenvalues.

4. Show that Q_I P_F = 0.

5. Let S = aP + bQ, where P and Q are orthogonal projections with P + Q = I and a, b > 0. Show that the eigenvalues of S are a and b, the nonzero eigenvalues of aP and bQ. Show that S^{−1} = a^{−1} P + b^{−1} Q.

6. Show that ‖Y‖² ~ σ² χ²(d) if Y ~ N(0, σ² P) and P is an orthogonal projection onto a subspace of dimension d (hint: use the spectral decomposition and the result above regarding the eigenvalues of P).

7. Check that P_P P_T = P_0 when P×T is balanced.

8. Check that L_T + L_P = V_0 ⊕ V_P ⊕ V_T.
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationRecall the convention that, for us, all vectors are column vectors.
Some linear algebra Recall the convention that, for us, all vectors are column vectors. 1. Symmetric matrices Let A be a real matrix. Recall that a complex number λ is an eigenvalue of A if there exists
More informationChapter Two Elements of Linear Algebra
Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationLecture 7: Positive Semidefinite Matrices
Lecture 7: Positive Semidefinite Matrices Rajat Mittal IIT Kanpur The main aim of this lecture note is to prepare your background for semidefinite programming. We have already seen some linear algebra.
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Mixed models Yan Lu March, 2018, week 8 1 / 32 Restricted Maximum Likelihood (REML) REML: uses a likelihood function calculated from the transformed set
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationSimple Linear Regression
Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1
MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical
More informationMixed models with correlated measurement errors
Mixed models with correlated measurement errors Rasmus Waagepetersen October 9, 2018 Example from Department of Health Technology 25 subjects where exposed to electric pulses of 11 different durations
More informationSTAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:
STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two
More informationLecture 3. Inference about multivariate normal distribution
Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For
More information3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is
Stat 501 Solutions and Comments on Exam 1 Spring 005-4 0-4 1. (a) (5 points) Y ~ N, -1-4 34 (b) (5 points) X (X,X ) = (5,8) ~ N ( 11.5, 0.9375 ) 3 1 (c) (10 points, for each part) (i), (ii), and (v) are
More informationCanonical Correlation Analysis of Longitudinal Data
Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate
More informationChapter 6: Orthogonality
Chapter 6: Orthogonality (Last Updated: November 7, 7) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). A few theorems have been moved around.. Inner products
More informationChapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests
Chapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests Throughout this chapter we consider a sample X taken from a population indexed by θ Θ R k. Instead of estimating the unknown parameter, we
More informationMaximum-Likelihood Estimation: Basic Ideas
Sociology 740 John Fox Lecture Notes Maximum-Likelihood Estimation: Basic Ideas Copyright 2014 by John Fox Maximum-Likelihood Estimation: Basic Ideas 1 I The method of maximum likelihood provides estimators
More informationANOVA: Analysis of Variance - Part I
ANOVA: Analysis of Variance - Part I The purpose of these notes is to discuss the theory behind the analysis of variance. It is a summary of the definitions and results presented in class with a few exercises.
More information1 Least Squares Estimation - multiple regression.
Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1
More informationStat 502 Design and Analysis of Experiments General Linear Model
1 Stat 502 Design and Analysis of Experiments General Linear Model Fritz Scholz Department of Statistics, University of Washington December 6, 2013 2 General Linear Hypothesis We assume the data vector
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini April 27, 2018 1 / 1 Table of Contents 2 / 1 Linear Algebra Review Read 3.1 and 3.2 from text. 1. Fundamental subspace (rank-nullity, etc.) Im(X ) = ker(x T ) R
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationStatistics 910, #5 1. Regression Methods
Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known
More informationChapter 3. Vector Spaces and Subspaces. Po-Ning Chen, Professor. Department of Electrical and Computer Engineering. National Chiao Tung University
Chapter 3 Vector Spaces and Subspaces Po-Ning Chen, Professor Department of Electrical and Computer Engineering National Chiao Tung University Hsin Chu, Taiwan 30010, R.O.C. 3.1 Spaces of vectors 3-1 What
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationClassification 2: Linear discriminant analysis (continued); logistic regression
Classification 2: Linear discriminant analysis (continued); logistic regression Ryan Tibshirani Data Mining: 36-462/36-662 April 4 2013 Optional reading: ISL 4.4, ESL 4.3; ISL 4.3, ESL 4.4 1 Reminder:
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationTopic 19 Extensions on the Likelihood Ratio
Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationSection 6.2, 6.3 Orthogonal Sets, Orthogonal Projections
Section 6. 6. Orthogonal Sets Orthogonal Projections Main Ideas in these sections: Orthogonal set = A set of mutually orthogonal vectors. OG LI. Orthogonal Projection of y onto u or onto an OG set {u u
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.
More information5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.
88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal
More informationLinear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77
Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical
More information4.6 Bases and Dimension
46 Bases and Dimension 281 40 (a) Show that {1,x,x 2,x 3 } is linearly independent on every interval (b) If f k (x) = x k for k = 0, 1,,n, show that {f 0,f 1,,f n } is linearly independent on every interval
More informationOne-Way ANOVA Model. group 1 y 11, y 12 y 13 y 1n1. group 2 y 21, y 22 y 2n2. group g y g1, y g2, y gng. g is # of groups,
One-Way ANOVA Model group 1 y 11, y 12 y 13 y 1n1 group 2 y 21, y 22 y 2n2 group g y g1, y g2, y gng g is # of groups, n i denotes # of obs in the i-th group, and the total sample size n = g i=1 n i. 1
More informationSTA 2101/442 Assignment 3 1
STA 2101/442 Assignment 3 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. Suppose X 1,..., X n are a random sample from a distribution with mean µ and variance
More informationNumerical Sequences and Series
Numerical Sequences and Series Written by Men-Gen Tsai email: b89902089@ntu.edu.tw. Prove that the convergence of {s n } implies convergence of { s n }. Is the converse true? Solution: Since {s n } is
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationBare-bones outline of eigenvalue theory and the Jordan canonical form
Bare-bones outline of eigenvalue theory and the Jordan canonical form April 3, 2007 N.B.: You should also consult the text/class notes for worked examples. Let F be a field, let V be a finite-dimensional
More informationSTA 2201/442 Assignment 2
STA 2201/442 Assignment 2 1. This is about how to simulate from a continuous univariate distribution. Let the random variable X have a continuous distribution with density f X (x) and cumulative distribution
More information22 Approximations - the method of least squares (1)
22 Approximations - the method of least squares () Suppose that for some y, the equation Ax = y has no solutions It may happpen that this is an important problem and we can t just forget about it If we
More information2.3. VECTOR SPACES 25
2.3. VECTOR SPACES 25 2.3 Vector Spaces MATH 294 FALL 982 PRELIM # 3a 2.3. Let C[, ] denote the space of continuous functions defined on the interval [,] (i.e. f(x) is a member of C[, ] if f(x) is continuous
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationLinear Model Under General Variance
Linear Model Under General Variance We have a sample of T random variables y 1, y 2,, y T, satisfying the linear model Y = X β + e, where Y = (y 1,, y T )' is a (T 1) vector of random variables, X = (T
More informationMISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30
MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationTentative solutions TMA4255 Applied Statistics 16 May, 2015
Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent
More information18.S096 Problem Set 3 Fall 2013 Regression Analysis Due Date: 10/8/2013
18.S096 Problem Set 3 Fall 013 Regression Analysis Due Date: 10/8/013 he Projection( Hat ) Matrix and Case Influence/Leverage Recall the setup for a linear regression model y = Xβ + ɛ where y and ɛ are
More information3 Multiple Linear Regression
3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is
More informationSolution Set 7, Fall '12
Solution Set 7, 18.06 Fall '12 1. Do Problem 26 from 5.1. (It might take a while but when you see it, it's easy) Solution. Let n 3, and let A be an n n matrix whose i, j entry is i + j. To show that det
More informationMultivariate Analysis and Likelihood Inference
Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density
More informationLecture 9. ANOVA: Random-effects model, sample size
Lecture 9. ANOVA: Random-effects model, sample size Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regressions and Analysis of Variance fall 2015 Fixed or random? Is it reasonable
More informationRandomized Complete Block Designs
Randomized Complete Block Designs David Allen University of Kentucky February 23, 2016 1 Randomized Complete Block Design There are many situations where it is impossible to use a completely randomized
More informationGraph coloring, perfect graphs
Lecture 5 (05.04.2013) Graph coloring, perfect graphs Scribe: Tomasz Kociumaka Lecturer: Marcin Pilipczuk 1 Introduction to graph coloring Definition 1. Let G be a simple undirected graph and k a positive
More informationThe Components of a Statistical Hypothesis Testing Problem
Statistical Inference: Recall from chapter 5 that statistical inference is the use of a subset of a population (the sample) to draw conclusions about the entire population. In chapter 5 we studied one
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 20, 2009, 8:00 am - 2:00 noon Instructions:. You have four hours to answer questions in this examination. 2. You must show
More informationFinite Singular Multivariate Gaussian Mixture
21/06/2016 Plan 1 Basic definitions Singular Multivariate Normal Distribution 2 3 Plan Singular Multivariate Normal Distribution 1 Basic definitions Singular Multivariate Normal Distribution 2 3 Multivariate
More informationREVIEW FOR EXAM III SIMILARITY AND DIAGONALIZATION
REVIEW FOR EXAM III The exam covers sections 4.4, the portions of 4. on systems of differential equations and on Markov chains, and..4. SIMILARITY AND DIAGONALIZATION. Two matrices A and B are similar
More informationTHE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam
THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay Solutions to Final Exam 1. (13 pts) Consider the monthly log returns, in percentages, of five
More informationAppendix A: Review of the General Linear Model
Appendix A: Review of the General Linear Model The generallinear modelis an important toolin many fmri data analyses. As the name general suggests, this model can be used for many different types of analyses,
More informationPractice Final Exam. December 14, 2009
Practice Final Exam December 14, 29 1 New Material 1.1 ANOVA 1. A purication process for a chemical involves passing it, in solution, through a resin on which impurities are adsorbed. A chemical engineer
More informationLecture 1: Linear Models and Applications
Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation
More informationRegression: Lecture 2
Regression: Lecture 2 Niels Richard Hansen April 26, 2012 Contents 1 Linear regression and least squares estimation 1 1.1 Distributional results................................ 3 2 Non-linear effects and
More information17: INFERENCE FOR MULTIPLE REGRESSION. Inference for Individual Regression Coefficients
17: INFERENCE FOR MULTIPLE REGRESSION Inference for Individual Regression Coefficients The results of this section require the assumption that the errors u are normally distributed. Let c i ij denote the
More information4 Derivations of the Discrete-Time Kalman Filter
Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof N Shimkin 4 Derivations of the Discrete-Time
More informationTopics in linear algebra
Chapter 6 Topics in linear algebra 6.1 Change of basis I want to remind you of one of the basic ideas in linear algebra: change of basis. Let F be a field, V and W be finite dimensional vector spaces over
More informationTesting a Normal Covariance Matrix for Small Samples with Monotone Missing Data
Applied Mathematical Sciences, Vol 3, 009, no 54, 695-70 Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Evelina Veleva Rousse University A Kanchev Department of Numerical
More informationSupplementary Notes March 23, The subgroup Ω for orthogonal groups
The subgroup Ω for orthogonal groups 18.704 Supplementary Notes March 23, 2005 In the case of the linear group, it is shown in the text that P SL(n, F ) (that is, the group SL(n) of determinant one matrices,
More information