1 Comparisons of Two Means
Edps/Soc 584 and Psych 594: Applied Multivariate Statistics
Carolyn J. Anderson
Department of Educational Psychology, University of Illinois at Urbana-Champaign
© Board of Trustees, University of Illinois
Comparisons of Two Means, Slide 1 of 68
2 Outline
$p$ variables, 2 matched pairs (i.e., dependent samples): $H_o: \mu_1 - \mu_2 = \delta = 0$
Repeated measures designs: 1 variable measured multiple times: $H_o: L\mu = 0$
Two independent samples: four cases of $H_o: \mu_1 = \mu_2$
Missing data: later in the semester
Reading: Johnson & Wichern
3 Paired Comparisons (dependent samples)
Paired observations arise in a number of different ways:
  Every subject (case) responds twice (e.g., pre/post test).
  Cases may be matched (on relevant variables) and then randomly assigned to one of two treatments.
  Naturally occurring pairs: husbands/wives, siblings, etc.
The plan: review the univariate case and then generalize to the multivariate situation.
For $j = 1,\ldots,n$ (number of pairs), let
  $X_{j1}$ = measurement (response) of the $j$th case given treatment 1,
  $X_{j2}$ = measurement (response) of the $j$th case given treatment 2.
We want to examine the differences
\[ D_j = X_{j1} - X_{j2} \]
4 Univariate Case
\[ D_j = X_{j1} - X_{j2} \]
If $D_j \sim N(\delta, \sigma_D^2)$, then the statistic
\[ t = \frac{\bar{D} - \delta}{s_D/\sqrt{n}} \]
has a Student's $t$ distribution with $n-1$ degrees of freedom, where
\[ \bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j = \frac{1}{n}\sum_{j=1}^{n}(X_{j1} - X_{j2}), \qquad s_D^2 = \frac{1}{n-1}\sum_{j=1}^{n}(D_j - \bar{D})^2 \]
Test $H_o: \delta = 0$ versus $H_A: \delta \ne 0$ (or $H_o: \delta = \delta_o$ versus $H_A: \delta \ne \delta_o$).
A $100(1-\alpha)\%$ confidence interval (estimate) of $\delta$:
\[ \bar{D} \pm t_{n-1}(\alpha/2)\,\frac{s_D}{\sqrt{n}} \]
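The univariate paired computation can be sketched in a few lines. This is a minimal illustration that assumes made-up scores for five cases (the lists `treatment_1` and `treatment_2` are hypothetical, not data from the slides) and uses only the Python standard library; the critical value $t_{n-1}(\alpha/2)$ would still come from a table or a statistics package.

```python
import math
import statistics

# Hypothetical paired scores for n = 5 cases (illustrative only, not from the slides).
treatment_1 = [12, 15, 11, 14, 13]
treatment_2 = [10, 14, 9, 13, 10]

# Differences D_j = X_j1 - X_j2
diffs = [a - b for a, b in zip(treatment_1, treatment_2)]
n = len(diffs)

d_bar = statistics.mean(diffs)    # sample mean of the differences
s_d = statistics.stdev(diffs)     # sample standard deviation (divisor n - 1)
std_err = s_d / math.sqrt(n)      # s_D / sqrt(n)

delta_0 = 0.0                     # hypothesized mean difference
t_stat = (d_bar - delta_0) / std_err

print(f"d_bar = {d_bar:.3f}, s_D = {s_d:.3f}, t = {t_stat:.3f} on {n - 1} df")
```

The same `d_bar` and `std_err` give the confidence interval once a value for $t_{n-1}(\alpha/2)$ is supplied.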
5 Advantage
The advantage of looking at differences using paired comparisons:
It eliminates effects of case-to-case variation, because the variance (standard deviation) of the differences is reduced to the extent that the scores/measurements are positively correlated:
\[ \sigma_D^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 - 2\sigma_{X_1,X_2} \]
This result comes from what we know about linear combinations:
\[ D = a'X = (1, -1)\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = X_1 - X_2 \]
so
\[ \mu_D = a'\mu, \qquad \mathrm{var}(D) = a'\Sigma a \]
where $\mu_{2\times 1}$ is the mean vector for $X$ and $\Sigma_{2\times 2}$ is the covariance matrix for $X$.
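Writing out the quadratic form makes the variance result explicit. The expansion below uses the same symbols as the slide; the correlation $\rho$ in the last line is introduced here only for illustration.

```latex
\[
\mathrm{var}(D) = a'\Sigma a
= \begin{pmatrix} 1 & -1 \end{pmatrix}
  \begin{pmatrix} \sigma^2_{X_1} & \sigma_{X_1,X_2} \\ \sigma_{X_1,X_2} & \sigma^2_{X_2} \end{pmatrix}
  \begin{pmatrix} 1 \\ -1 \end{pmatrix}
= \sigma^2_{X_1} + \sigma^2_{X_2} - 2\sigma_{X_1,X_2}
\]
\[
\text{With } \sigma_{X_1,X_2} = \rho\,\sigma_{X_1}\sigma_{X_2}: \quad
\sigma_D^2 = \sigma^2_{X_1} + \sigma^2_{X_2} - 2\rho\,\sigma_{X_1}\sigma_{X_2},
\text{ which shrinks as } \rho \text{ increases.}
\]
```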
6 Multivariate Situation
Record $p$ variables for each treatment (condition) for each member of each pair. For case $j$, we have
  $X_{1j1}$ = variable 1, treatment 1;  $X_{2j1}$ = variable 1, treatment 2
  $X_{1j2}$ = variable 2, treatment 1;  $X_{2j2}$ = variable 2, treatment 2
  ...
  $X_{1jp}$ = variable $p$, treatment 1;  $X_{2jp}$ = variable $p$, treatment 2
where $j = 1,\ldots,n$ ($n$ = the number of pairs that we have).
We study the differences
\[ D_{j1} = X_{1j1} - X_{2j1}, \quad D_{j2} = X_{1j2} - X_{2j2}, \quad \ldots, \quad D_{jp} = X_{1jp} - X_{2jp} \]
\[ D_j = (D_{j1}, D_{j2}, \ldots, D_{jp})' \]
7 Needed for Statistical Inference
Assume the $D_j \sim N_p(\delta, \Sigma_D)$ and i.i.d. for $j = 1,\ldots,n$, where
\[ \delta = (\delta_1, \delta_2, \ldots, \delta_p)' = E(D_j) \]
If the differences $D_1, D_2, \ldots, D_n$ are a random sample from a $N_p(\delta, \Sigma_D)$ population, then
\[ T^2 = n(\bar{D} - \delta)' S_d^{-1} (\bar{D} - \delta) \sim \frac{(n-1)p}{n-p} F_{p,\,n-p} \]
where $S_d$ is the sample covariance matrix of the differences.
Modification for large samples: If $n$ and $(n-p)$ are large, then $T^2$ is approximately distributed as a $\chi^2_p$ random variable regardless of the distribution of $D_j$ (i.e., $D_j$ may not be multivariate normal, but $\delta$ and $\Sigma_D^{-1}$ exist).
8 Statistical Inference
Suppose that we have observations $d_j = (d_{j1}, d_{j2}, \ldots, d_{jp})'$ for $j = 1,\ldots,n$.
Descriptive statistics:
\[ \bar{d}_{p\times 1} = \frac{1}{n}\sum_{j=1}^{n} d_j \qquad \text{and} \qquad S_{d,(p\times p)} = \frac{1}{n-1}\sum_{j=1}^{n}(d_j - \bar{d})(d_j - \bar{d})' \]
Hypothesis test: $H_o: \delta = 0$ versus $H_A: \delta \ne 0$, assuming $D_j \sim N_p(\delta, \Sigma_D)$ and i.i.d.
Reject $H_o$ if
\[ T^2 = n\,\bar{d}' S_d^{-1} \bar{d} > \frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha) \]
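The test statistic can be computed directly from a set of difference vectors. The sketch below is an illustration under stated assumptions: the difference vectors in `diffs` are hypothetical (not the essay data), $p = 2$ so that the inverse of $S_d$ has a simple closed form, and only the standard library is used. The critical value $F_{p,n-p}(\alpha)$ would still have to come from a table or a statistics package.

```python
# Hypothetical difference vectors d_j = (d_j1, d_j2) for n = 5 pairs (not the essay data).
diffs = [(3, 1), (5, 2), (2, 0), (6, 3), (4, 1)]
n, p = len(diffs), 2

# Mean difference vector d_bar
d_bar = [sum(d[i] for d in diffs) / n for i in range(p)]


def cov(i, k):
    """Element (i, k) of the sample covariance matrix S_d (divisor n - 1)."""
    return sum((d[i] - d_bar[i]) * (d[k] - d_bar[k]) for d in diffs) / (n - 1)


s11, s12, s22 = cov(0, 0), cov(0, 1), cov(1, 1)

# Quadratic form d_bar' S_d^{-1} d_bar using the closed-form 2 x 2 inverse
det = s11 * s22 - s12 ** 2
quad = (s22 * d_bar[0] ** 2 - 2 * s12 * d_bar[0] * d_bar[1] + s11 * d_bar[1] ** 2) / det

t2 = n * quad
f_stat = (n - p) / ((n - 1) * p) * t2   # refer to F with (p, n - p) df

print(f"T^2 = {t2:.3f}, F = {f_stat:.3f} on ({p}, {n - p}) df")
```

For general $p$ the closed-form inverse would be replaced by a linear solve from a matrix library.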
9 If you Reject $H_o: \delta = 0$
Confidence region:
\[ n(\bar{D} - \delta)' S^{-1} (\bar{D} - \delta) \le \frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha) \]
Simultaneous $T^2$ intervals for individual differences of component means:
\[ \delta_i: \quad \bar{d}_i \pm \sqrt{\frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha)}\;\sqrt{\frac{s_{d_i}^2}{n}} \]
where $\bar{d}_i$ is the mean difference of the $i$th variable and $s_{d_i}^2$ is the $i$th diagonal element of $S_d$.
Bonferroni $100(1-\alpha)\%$ confidence intervals:
\[ \delta_i: \quad \bar{d}_i \pm t_{n-1}(\alpha/2m)\,\sqrt{\frac{s_{d_i}^2}{n}} \]
where $m$ = the number of confidence intervals.
10 Large Samples
For large $(n-p)$ (i.e., $D_j$ need not be multivariate normal):
\[ \frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha) \approx \chi^2_p(\alpha) \]
11 Example: The data
Data from Table 5.9 of Rencher (2007):
"Each of 15 students wrote an informal and a formal essay (Kramer, 1972, p. 100). The variables recorded were the number of words and number of verbs."
  y1 = words in informal essay
  y2 = verbs in informal essay
  y3 = words in formal essay
  y4 = verbs in formal essay
These are count data. Does the CLT kick in? $n = 15$ is smallish.
Sample statistics for the differences, $d$ = words [verbs] informal $-$ words [verbs] formal:
\[ \bar{d} = (32.80, 3.53)' \quad \text{(words, verbs)} \]
[The entries of the sample covariance matrix $S$ are missing in the source.]
12 Plot of the Data
[Figure: plot of the essay data]
13 Plot of the Data: Cases Connected
[Figure: plot of the essay data with cases connected]
14 Plot of the Differences
[Figure: plot of the differences]
15 Example: Test $H_o: \delta = 0$ versus $H_A: \delta \ne 0$
(i.e., the number of words and verbs in informal and formal essays are the same).
\[ T^2 = 15\,(32.80, 3.53)\,S^{-1}\,(32.80, 3.53)' \approx 15.19 \]
(the intermediate matrix entries are missing in the source; 15.19 is the value implied by the $F$ statistic below, since $T^2 = (28/13)(7.053)$).
The critical value is $(14(2)/13)F_{2,13}(.05) = 8.20$.
Alternatively, $(13/((14)2))\,T^2 = 7.053$, which is distributed as $F_{2,13}$ and has a p-value of .008.
Conclusion: Reject $H_o$. The data support the conclusion that the number of words and verbs in informal essays are not equal to the number in formal ones.
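The scaling between $T^2$ and $F$ in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported on the slide ($n = 15$, $p = 2$, $F = 7.053$, critical value 8.20); the variable names are introduced here for illustration.

```python
# Values reported on the slide for the essay example
n, p = 15, 2
f_reported = 7.053      # (n - p) / ((n - 1) p) * T^2
crit_t2_scale = 8.20    # ((n - 1) p / (n - p)) * F_{2,13}(.05)

scale = (n - 1) * p / (n - p)             # 28 / 13
t2_implied = scale * f_reported           # T^2 implied by the reported F statistic
f_crit_implied = crit_t2_scale / scale    # F_{2,13}(.05) implied by the critical value

reject = t2_implied > crit_t2_scale

print(f"T^2 = {t2_implied:.2f}, implied F critical value = {f_crit_implied:.2f}, reject = {reject}")
```

Comparing $T^2$ with 8.20 and comparing $F = 7.053$ with the implied $F$ critical value are the same decision on two scales.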
16 95% Confidence Region for $\delta$
From SAS > Solutions > Interactive Data Analysis: Analyze > Multivariate (scatter plot, curves).
[Figure: 95% confidence region for $\delta$]
17 95% Confidence Region for the Mean
[Figure: 95% confidence region for the mean]
18 SAS for the Last Figure
  proc sgscatter data=essay;
    compare y=dverbs x=dwords / ellipse=(type=mean);
    title '95% Confidence Region for the mean Difference';
  run;
19 Confidence Region, $T^2$ & Bonferroni Intervals
[Figure: confidence region with $T^2$ and Bonferroni intervals; axes: Words (horizontal), Verbs (vertical); marked points: $\bar{d} = (32.80, 3.53)$ and $\delta_o = (0, 0)$]
20 Another way to calculate $T^2$ for paired comparisons
So far we've "divided the sample"; that is, $D = X_1 - X_2$.
Now we'll consider a "Full Sample" method that treats every case as a pair, with $p$ measures on each member of the pair.
  Pair or case number:   1             2             ...   j             ...   n
  Condition (a):         p variables   p variables   ...   p variables   ...   p variables
  Condition (b):         p variables   p variables   ...   p variables   ...   p variables
So we have $2p$ variables measured for each case (pair).
In an experimental situation, the conditions are assumed to have been randomly assigned to members of the pairs.
21 Full Data Method for paired comparisons
Full data matrix:
\[ X_{n\times 2p} = \begin{pmatrix}
X_{111} & X_{112} & \cdots & X_{11p} & X_{121} & X_{122} & \cdots & X_{12p} \\
X_{211} & X_{212} & \cdots & X_{21p} & X_{221} & X_{222} & \cdots & X_{22p} \\
\vdots & & & \vdots & \vdots & & & \vdots \\
X_{n11} & X_{n12} & \cdots & X_{n1p} & X_{n21} & X_{n22} & \cdots & X_{n2p}
\end{pmatrix} = (\underbrace{X_1}_{n\times p} \mid \underbrace{X_2}_{n\times p}) \]
Full sample mean vector:
\[ \bar{x}' = (\bar{x}_{11}, \bar{x}_{12}, \ldots, \bar{x}_{1p}, \bar{x}_{21}, \ldots, \bar{x}_{2p}) = (\bar{x}_1', \bar{x}_2') \]
22 Full Data Method for paired comparisons
Full data sample covariance matrix:
\[ S_{2p\times 2p} = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} \]
where
  $S_{11}$ is the $(p\times p)$ covariance matrix for $X_1$,
  $S_{22}$ is the $(p\times p)$ covariance matrix for $X_2$,
  $S_{12} = S_{21}'$ is the $(p\times p)$ covariance matrix between $X_1$ and $X_2$.
Define a contrast matrix:
\[ C_{p\times 2p} = (I_{p\times p} \mid -I_{p\times p}) \]
What condition do you need to have a contrast matrix?
23 Computations for Full Data
Let $x_{j,(2p\times 1)}$ = the $j$th row of $X_{(n\times 2p)}$ written as a column vector. Then
\[ d_j = Cx_j, \qquad \bar{d} = C\bar{x} = C\Big(\frac{1}{n}\sum_{j=1}^{n} x_j\Big), \qquad S_d = CSC' \]
Putting all of this together yields
\[ T^2 = n(C\bar{x})'(CSC')^{-1}(C\bar{x}) = n\,\bar{x}'C'(CSC')^{-1}C\bar{x} \]
With this method, we don't have to split the data set and compute the differences.
We'll see more uses of contrast matrices relatively soon.
SAS/IML code for the essay example.
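The equivalence of the two routes can be checked numerically. The following is a minimal sketch, assuming a hypothetical full data matrix `X` with $n = 5$ cases and $p = 2$ variables per condition (these values are not the essay data) and using small helper functions written here with the standard library only.

```python
# Hypothetical full data matrix: n = 5 cases, columns = (cond a var 1, cond a var 2,
# cond b var 1, cond b var 2). Illustrative values only.
X = [
    [10, 5, 7, 4],
    [12, 6, 7, 4],
    [9, 4, 7, 4],
    [14, 8, 8, 5],
    [11, 5, 7, 4],
]
C = [[1, 0, -1, 0],
     [0, 1, 0, -1]]   # C = (I, -I) with p = 2


def mean_vector(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]


def covariance(rows):
    n, m = len(rows), mean_vector(rows)
    k = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def t2_from(mean, cov2, n):
    """n * mean' cov2^{-1} mean for a 2 x 2 covariance matrix."""
    (a, b), (_, d) = cov2
    det = a * d - b * b
    return n * (d * mean[0] ** 2 - 2 * b * mean[0] * mean[1] + a * mean[1] ** 2) / det


n = len(X)

# Route 1: split the data, form the differences, then compute T^2.
diffs = [[r[0] - r[2], r[1] - r[3]] for r in X]
t2_diff = t2_from(mean_vector(diffs), covariance(diffs), n)

# Route 2: keep the full data and apply the contrast matrix.
x_bar = mean_vector(X)
S = covariance(X)
c_xbar = [sum(c * x for c, x in zip(row, x_bar)) for row in C]   # C x_bar
CSCt = matmul(matmul(C, S), transpose(C))                        # C S C'
t2_full = t2_from(c_xbar, CSCt, n)

print(f"T^2 from differences = {t2_diff:.4f}, from contrast matrix = {t2_full:.4f}")
```

Note that the full $4 \times 4$ matrix `S` is never inverted; only the $p \times p$ matrix $CSC'$ needs to be nonsingular.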
24 Repeated Measures Designs for comparing conditions (treatments, etc.)
This is another generalization of the univariate paired $t$ test.
Situation: $q$ conditions are compared with respect to one response variable. Each case receives each treatment once over successive periods of time. The order of the treatments should be randomized (and counterbalanced if possible).
Example from Cochran & Cox (1957) (I got this from Timm 1980): There are four calculator designs and each person does specified computations. Their speed is recorded for each of the four calculators. The order of calculator use was randomly assigned.
This is "repeated measures" because each case (person) gets each treatment (calculator): we have repeated observations or measurements on each case.
25 Repeated Measures Designs
Let the $j$th observation equal
\[ x_j = (x_{j1}, x_{j2}, \ldots, x_{jq})', \qquad j = 1,\ldots,n \]
where $x_{ji}$ = response or measurement of the $i$th treatment on the $j$th case.
Question (hypothesis): Is there a treatment effect?
\[ H_o: \mu_1 = \mu_2 = \cdots = \mu_q \quad \text{versus} \quad H_A: \text{not } H_o \]
This is the same hypothesis tested in univariate repeated measures ANOVA.
26 Repeated Measures as a Multivariate Test
To test this as a hypothesis about a multivariate mean vector, we need to use contrasts of the components of $\mu$.
Assume $X_j \sim N_q(\mu, \Sigma)$, with
\[ \mu = E(x_j) = (\mu_1, \mu_2, \ldots, \mu_q)' \]
Set up a contrast:
\[ \underbrace{\begin{pmatrix} \mu_1 - \mu_2 \\ \mu_1 - \mu_3 \\ \vdots \\ \mu_1 - \mu_q \end{pmatrix}}_{(q-1)\times 1}
= \underbrace{\begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{pmatrix}}_{(q-1)\times q}
\underbrace{\begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_q \end{pmatrix}}_{q\times 1} = C_1\mu \]
So $H_o: C_1\mu = 0$ (no treatment effect).
27 Contrast Matrices
Any contrast matrix of size $(q-1)\times q$ will do. For example,
\[ C_2\mu = \underbrace{\begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}}_{(q-1)\times q}
\underbrace{\begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_q \end{pmatrix}}_{q\times 1}
= \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_2 - \mu_3 \\ \vdots \\ \mu_{q-1} - \mu_q \end{pmatrix} \]
To be a contrast matrix:
  The rows are linearly independent.
  Each row is a contrast vector (its elements sum to zero).
28 Hypothesis and Test for Repeated Measures
The hypothesis of no effects due to treatment in a repeated measures design,
\[ H_o: \mu_1 = \mu_2 = \cdots = \mu_q, \]
is the same as performing Hotelling's $T^2$ test of
\[ H_o: C\mu = 0 \]
where $C$ is a $(q-1)\times q$ contrast matrix.
Given data $x_1, x_2, \ldots, x_n$ and a contrast matrix $C$, the $T^2$ test statistic equals
\[ T^2 = n(C\bar{x})'(CSC')^{-1}C\bar{x} \]
Reject $H_o$ if
\[ T^2 > \frac{(n-1)(q-1)}{n-q+1} F_{(q-1),(n-q+1)}(\alpha) \]
Now for our example: plot the data and then use SAS/IML.
29 (Scatter) Plot of the Calculator Data
[Figure: scatter plot of the calculator data]
30 Input 1 from SAS/IML
  proc iml;
  * A module that computes Hotelling's T^2 for one-sample tests;
  start Tsq(X, muo, Ts, pvalue);
    n = nrow(X);
    one = j(n, 1);                           /* n x 1 vector of ones */
    Xbar = X`*one/n;                         /* sample mean vector */
    XbarM = one*Xbar`;                       /* every row holds the mean vector */
    S = (X - XbarM)`*(X - XbarM)/(n-1);      /* sample covariance matrix */
    Ts = n*(Xbar - muo)`*inv(S)*(Xbar - muo);
    p = ncol(X);
    dfden = n - p;                           /* denominator df, matching F(p, n-p) */
    F = ((n-p)/((n-1)*p))*Ts;                /* scale T^2 to an F(p, n-p) variate */
    pvalue = 1 - cdf('F', F, p, dfden);
  finish Tsq;
31 Input continued
  X = { , , , , };
  C1 = { , , };
  muo = {0, 0, 0};
  X1 = X*C1`;
  run stats(X1, n1, xbar1, W1, S1);
  run Tsq(X1, muo, Tsq1, pvalue1);
[The numeric entries of X and C1 are missing in the source.]
32 Output 1 from SAS/IML
Data matrix (5 subjects x 4 variables) = X
C1
Using C1:
[The printed values are missing in the source.]
33 Output 1 continued
X*C1 =
XBAR1: mean of C1*X1 =
TSQ1, PVALUE1: T^2 for C1*mu=0 ----> with p-value =
[The printed values are missing in the source.]
34 Using Contrast Matrix 2
  C2 = { , , };
  X2 = X*C2`;
  run stats(X2, n2, xbar2, W2, S2);
  run Tsq(X2, muo, Tsq2, pvalue2);
(Partial) output from this:
XBAR2: mean of C2*X2 =
TSQ2, PVALUE2: T^2 for C2*mu=0 ----> with p-value =
[The numeric entries and printed values are missing in the source.]
With different contrast matrices, we get different $C\bar{x}$ vectors, but $T^2$, the p-value, and the conclusions are exactly the same.
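The invariance of $T^2$ to the choice of contrast matrix can be seen on a small case. The sketch below assumes hypothetical data with $n = 5$ cases and $q = 3$ conditions (not the calculator data, whose values are not available here), so each contrast matrix has two rows and the $2 \times 2$ inverse can be written in closed form. As in the SAS/IML code, the data are first transformed by the contrast matrix and $T^2$ is then computed on the transformed scores.

```python
# Hypothetical repeated measures data: n = 5 cases, q = 3 conditions (not the calculator data).
X = [
    [30, 28, 25],
    [22, 23, 20],
    [29, 26, 27],
    [25, 24, 21],
    [27, 27, 24],
]
C1 = [[1, -1, 0], [1, 0, -1]]   # compare condition 1 with each other condition
C2 = [[1, -1, 0], [0, 1, -1]]   # compare adjacent conditions


def hotelling_t2(rows, C):
    """T^2 = n (C x_bar)' (C S C')^{-1} (C x_bar) for a 2-row contrast matrix C."""
    n = len(rows)
    # Transform each case: y_j = C x_j, so mean(y) = C x_bar and cov(y) = C S C'.
    Y = [[sum(c * x for c, x in zip(c_row, r)) for c_row in C] for r in rows]
    m = [sum(y[i] for y in Y) / n for i in range(2)]
    s = [[sum((y[a] - m[a]) * (y[b] - m[b]) for y in Y) / (n - 1) for b in range(2)]
         for a in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] ** 2
    quad = (s[1][1] * m[0] ** 2 - 2 * s[0][1] * m[0] * m[1] + s[0][0] * m[1] ** 2) / det
    return n * quad, m


t2_c1, mean_c1 = hotelling_t2(X, C1)
t2_c2, mean_c2 = hotelling_t2(X, C2)

print("C1 x_bar =", mean_c1, " T^2 =", round(t2_c1, 4))
print("C2 x_bar =", mean_c2, " T^2 =", round(t2_c2, 4))
```

The two mean vectors differ, but the two $T^2$ values agree, because the rows of `C2` are linear combinations of the rows of `C1` and the quadratic form is unchanged by such a nonsingular transformation.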
35 $T^2$ and Repeated Measures
As before, the $100(1-\alpha)\%$ confidence region consists of all $C\mu$ such that
\[ n(C\bar{x} - C\mu)'(CSC')^{-1}(C\bar{x} - C\mu) \le \frac{(n-1)(q-1)}{(n-q+1)} F_{(q-1),(n-q+1)}(\alpha) \]
Simultaneous $T^2$ intervals for a single contrast $c_i'\mu$, where $c_i'$ is the $i$th row of the matrix $C$:
\[ c_i'\bar{x} \pm \underbrace{\sqrt{\frac{(n-1)(q-1)}{(n-q+1)} F_{(q-1),(n-q+1)}(\alpha)}}\;\sqrt{\frac{c_i'Sc_i}{n}} \]
For Bonferroni (or one-at-a-time) confidence intervals, replace the statistic above the brace by the appropriate value from the $t_{n-1}$ distribution.
For large $n$, the $\chi^2_{q-1}$ distribution can be used.
36 ANOVA vs multivariate $T^2$
The multivariate $T^2$ is appropriate for situations where we cannot assume that the covariance matrix for $X$ has a particular structure.
With repeated measures ANOVA you must assume that $\Sigma_X$ has a special structure, in particular spherical:
\[ \Sigma_X = \begin{pmatrix} \sigma^2 & \tau & \tau \\ \tau & \sigma^2 & \tau \\ \tau & \tau & \sigma^2 \end{pmatrix} \]
Unlikely, but this works too: $\Sigma = \sigma^2 I$.
If the assumptions on the structure of $\Sigma$ are met, then repeated measures ANOVA is more powerful than multivariate $T^2$ because the repeated measures ANOVA takes the structure of $\Sigma$ into account.
If the assumptions on $\Sigma$ are not met, $T^2$ is still valid but repeated measures ANOVA is not.
37 Two Independent Samples
Situation: Two samples, each having $p$ measurements, where we have a random sample of size $n_1$ from population 1 and a random sample of size $n_2$ from population 2.
  Sample from population 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$
  Sample from population 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$
Sample means:
\[ \bar{x}_1 = \frac{1}{n_1}\sum_{j=1}^{n_1} x_{1j}, \qquad \bar{x}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} x_{2j} \]
Sample covariance matrices:
\[ S_1 = \frac{1}{n_1 - 1}\sum_{j=1}^{n_1}(x_{1j} - \bar{x}_1)(x_{1j} - \bar{x}_1)', \qquad S_2 = \frac{1}{n_2 - 1}\sum_{j=1}^{n_2}(x_{2j} - \bar{x}_2)(x_{2j} - \bar{x}_2)' \]
Hypothesis: $H_o: \mu_1 = \mu_2$
38 Assumptions
1. The sample $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample of size $n_1$ from a $p$-variate population with mean vector $\mu_1$ and covariance matrix $\Sigma_1$.
2. The sample $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample of size $n_2$ from a $p$-variate population with mean vector $\mu_2$ and covariance matrix $\Sigma_2$.
3. The samples are (statistically) independent of each other.
These assumptions are required when we want to test
  $H_o: \mu_1 = \mu_2$, or equivalently $\mu_1 - \mu_2 = 0$
  $H_A: \mu_1 \ne \mu_2$, or equivalently $\mu_1 - \mu_2 \ne 0$
If $n_1$ and/or $n_2$ are small, then we must make two additional assumptions:
4. Both populations are multivariate normal.
5. $\Sigma_1 = \Sigma_2$. This is a very strong assumption (stronger than in the univariate case).
39 Case 1: Known $\Sigma_1$ and $\Sigma_2$
To develop the test for independent populations, we'll start by supposing that we know $\Sigma_1$ and $\Sigma_2$ (i.e., we don't have to estimate them) and assume the first 4 assumptions made on the previous slide.
Under $H_o$, the test statistic would be
\[ (\bar{x}_1 - \bar{x}_2)'\Big(\frac{1}{n_1}\Sigma_1 + \frac{1}{n_2}\Sigma_2\Big)^{-1}(\bar{x}_1 - \bar{x}_2) \sim \chi^2_p \]
because
\[ (\bar{x}_1 - \bar{x}_2) \sim N_p\Big((\mu_1 - \mu_2),\; \frac{1}{n_1}\Sigma_1 + \frac{1}{n_2}\Sigma_2\Big) \]
Why is $(\bar{x}_1 - \bar{x}_2)$ multivariate normal?
When $H_o$ is true, $\mu_1 - \mu_2 = 0$ and the test statistic should be small.
40 Case 2: $\Sigma_1$ and $\Sigma_2$ Unknown
$\Sigma_1$ and $\Sigma_2$ must be estimated.
For this more realistic case, we must also assume
\[ \Sigma_1 = \Sigma_2 = \Sigma \]
Since $\Sigma_1 = \Sigma_2 = \Sigma$, we will estimate $\Sigma$ by pooling the data from the two samples:
\[ S_{pool} = \frac{(n_1 - 1)S_1 + (n_2 - 1)S_2}{n_1 + n_2 - 2}
= \frac{\sum_{j=1}^{n_1}(x_{1j} - \bar{x}_1)(x_{1j} - \bar{x}_1)' + \sum_{j=1}^{n_2}(x_{2j} - \bar{x}_2)(x_{2j} - \bar{x}_2)'}{n_1 + n_2 - 2} \]
$S_{pool}$ is an estimator of $\Sigma$ with $df = n_1 + n_2 - 2$.
41 Distribution of Linear Combination
Consider the linear combination of two random vectors, $\bar{x}_1 - \bar{x}_2$:
\[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]
\[ \Sigma_{\bar{x}_1 - \bar{x}_2} = \mathrm{cov}(\bar{x}_1 - \bar{x}_2) = \mathrm{cov}(\bar{x}_1) + \mathrm{cov}(\bar{x}_2) \quad \text{(independent samples)} \]
\[ = \frac{1}{n_1}\Sigma + \frac{1}{n_2}\Sigma = \Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)\Sigma \]
which is estimated by $\big(\frac{1}{n_1} + \frac{1}{n_2}\big)S_{pool}$.
When $x_{11}, \ldots, x_{1n_1}$ is a random sample of size $n_1$ from $N(\mu_1, \Sigma)$ and $x_{21}, \ldots, x_{2n_2}$ is a random sample of size $n_2$ from $N(\mu_2, \Sigma)$, the test statistic for $H_o: \mu_1 - \mu_2 = \delta_o$ is
\[ T^2 = ((\bar{x}_1 - \bar{x}_2) - \delta_o)'\Big(\Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)S_{pool}\Big)^{-1}((\bar{x}_1 - \bar{x}_2) - \delta_o) \]
42 Distribution of Test Statistic
The test statistic
\[ T^2 = ((\bar{x}_1 - \bar{x}_2) - \delta_o)'\Big(\Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)S_{pool}\Big)^{-1}((\bar{x}_1 - \bar{x}_2) - \delta_o) \]
has a sampling distribution that is
\[ \frac{(n_1 + n_2 - 2)p}{(n_1 + n_2 - p - 1)} F_{p,(n_1 + n_2 - p - 1)} \]
or we could just refer
\[ \frac{(n_1 + n_2 - p - 1)}{(n_1 + n_2 - 2)p}\,T^2 \quad \text{to} \quad F_{p,(n_1 + n_2 - p - 1)} \]
Note:
\[ \Big(\Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)S_{pool}\Big)^{-1} = \Big(\Big(\frac{n_1 + n_2}{n_1 n_2}\Big)S_{pool}\Big)^{-1} = \frac{n_1 n_2}{n_1 + n_2}(S_{pool})^{-1} \]
So sometimes you'll see
\[ T^2 = \frac{n_1 n_2}{n_1 + n_2}((\bar{x}_1 - \bar{x}_2) - \delta_o)' S_{pool}^{-1}((\bar{x}_1 - \bar{x}_2) - \delta_o) \]
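The pooled covariance matrix and the two-sample $T^2$ can be computed step by step. The sketch below assumes two small hypothetical samples with $p = 2$ (the values in `sample_1` and `sample_2` are made up, not the electricity data), tests $\delta_o = 0$, and uses the second form of $T^2$ given above with a closed-form $2 \times 2$ inverse.

```python
# Hypothetical samples with p = 2 measurements per case (illustrative values only).
sample_1 = [(5, 3), (7, 4), (6, 6), (6, 3)]   # n1 = 4
sample_2 = [(2, 1), (4, 3), (3, 2)]           # n2 = 3
n1, n2, p = len(sample_1), len(sample_2), 2


def mean_vector(rows):
    return [sum(r[i] for r in rows) / len(rows) for i in range(p)]


def sscp(rows, m):
    """Sums of squares and cross-products about the mean m; equals (n - 1) S."""
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) for b in range(p)]
            for a in range(p)]


m1, m2 = mean_vector(sample_1), mean_vector(sample_2)
w1, w2 = sscp(sample_1, m1), sscp(sample_2, m2)

# S_pool = ((n1 - 1) S1 + (n2 - 1) S2) / (n1 + n2 - 2)
df = n1 + n2 - 2
s_pool = [[(w1[a][b] + w2[a][b]) / df for b in range(p)] for a in range(p)]

# T^2 = (n1 n2 / (n1 + n2)) * diff' S_pool^{-1} diff, testing delta_o = 0
diff = [m1[i] - m2[i] for i in range(p)]
det = s_pool[0][0] * s_pool[1][1] - s_pool[0][1] ** 2
quad = (s_pool[1][1] * diff[0] ** 2
        - 2 * s_pool[0][1] * diff[0] * diff[1]
        + s_pool[0][0] * diff[1] ** 2) / det
t2 = n1 * n2 / (n1 + n2) * quad

# Scale to an F statistic with (p, n1 + n2 - p - 1) degrees of freedom
f_stat = (n1 + n2 - p - 1) / (df * p) * t2

print(f"S_pool = {s_pool}")
print(f"T^2 = {t2:.4f}, F = {f_stat:.4f} on ({p}, {n1 + n2 - p - 1}) df")
```

The decision would follow from comparing `f_stat` with $F_{p,\,n_1+n_2-p-1}(\alpha)$ from a table or a statistics package.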
43 Example: Two Independent Samples $T^2$
From Johnson & Wichern: Wisconsin homeowners without air conditioning ($n_1 = 45$) and those with air conditioning ($n_2 = 55$).
  $X_1$ = total on-peak consumption of electricity, July 1977 (in kilowatts)
  $X_2$ = total off-peak consumption of electricity, July 1977 (in kilowatts)
\[ \bar{x}_1 = (204.4, 556.6)', \qquad \bar{x}_2 = (130.0, 355.0)', \qquad (\bar{x}_1 - \bar{x}_2) = (74.4, 201.6)' \]
\[ S_{pool} = \frac{44S_1 + 54S_2}{98} \]
[The entries of $S_1$, $S_2$, and $S_{pool}$ are missing in the source.]
44 Example continued
The estimated covariance matrix of $(\bar{x}_1 - \bar{x}_2)$ is
\[ S_{\bar{x}_1 - \bar{x}_2} = \Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)S_{pool} = \Big(\frac{1}{45} + \frac{1}{55}\Big)S_{pool} \]
To test $H_o: \delta = (\mu_1 - \mu_2) = 0$, compute the test statistic
\[ T^2 = (\bar{x}_1 - \bar{x}_2)' S_{\bar{x}_1 - \bar{x}_2}^{-1}(\bar{x}_1 - \bar{x}_2) = (74.4, 201.6)\, S_{\bar{x}_1 - \bar{x}_2}^{-1}\, (74.4, 201.6)' \]
[The matrix entries and the value of the statistic are missing in the source.]
For $\alpha = .05$: $(98(2)/97)F_{2,97}(.05) = 2.02(3.1) = 6.26$.
Conclusion...
45 $100(1-\alpha)\%$ Confidence Region for $\mu_1 - \mu_2$
Is the set of all $\delta = \mu_1 - \mu_2$ such that
\[ \frac{n_1 n_2}{n_1 + n_2}((\bar{x}_1 - \bar{x}_2) - \delta)' S_{pool}^{-1}((\bar{x}_1 - \bar{x}_2) - \delta) \le c^2 \]
where
\[ c^2 = \frac{(n_1 + n_2 - 2)p}{(n_1 + n_2 - p - 1)} F_{p,(n_1 + n_2 - p - 1)}(\alpha) \]
To study the ellipsoid, we can focus on the eigenvalues and eigenvectors of $S_{pool}$. The axes of the ellipsoid are
\[ (\bar{x}_1 - \bar{x}_2) \pm \sqrt{\lambda_i\Big(\frac{1}{n_1} + \frac{1}{n_2}\Big)c^2}\; e_i, \qquad i = 1,\ldots,p \]
where $\lambda_i$ and $e_i$ are the eigenvalues and eigenvectors of $S_{pool}$.
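For $p = 2$ the axes can be computed without a matrix library, because the eigenvalues and eigenvectors of a symmetric $2 \times 2$ matrix have closed forms. The sketch below is an illustration only: `s_pool`, `n1`, `n2`, `c2`, and `center` are hypothetical inputs chosen here (they are not the electricity example), and the closed form used for the eigenvectors assumes a nonzero off-diagonal element.

```python
import math

# Hypothetical inputs (not the electricity data).
s_pool = [[4.0, 2.0], [2.0, 3.0]]   # pooled covariance matrix
n1, n2 = 20, 30
c2 = 6.5                            # assumed value of c^2 (would come from an F table)
center = (1.5, 2.0)                 # x_bar_1 - x_bar_2

a, b, d = s_pool[0][0], s_pool[0][1], s_pool[1][1]

# Eigenvalues of a symmetric 2 x 2 matrix in closed form (largest first)
half_trace = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
eigenvalues = [half_trace + radius, half_trace - radius]


def unit_eigenvector(lam):
    # For b != 0, v = (b, lam - a) solves (S - lam I) v = 0; normalize to length 1.
    v = (b, lam - a)
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)


scale = (1 / n1 + 1 / n2) * c2
axes = []
for lam in eigenvalues:
    e = unit_eigenvector(lam)
    half_length = math.sqrt(lam * scale)
    end_plus = tuple(c + half_length * ei for c, ei in zip(center, e))
    end_minus = tuple(c - half_length * ei for c, ei in zip(center, e))
    axes.append((lam, e, half_length, end_minus, end_plus))

for lam, e, half_length, end_minus, end_plus in axes:
    print(f"lambda = {lam:.4f}, e = ({e[0]:.3f}, {e[1]:.3f}), "
          f"half-length = {half_length:.3f}, endpoints = {end_minus} to {end_plus}")
```

The first entry of `axes` is the major axis (largest eigenvalue) and the second is the minor axis.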
46 Example: Confidence Region
The 95% Confidence Region (Ellipse): the set of all possible $(\mu_1 - \mu_2)$ that satisfy
$$\left((74.4 - \delta_1), (201.6 - \delta_2)\right) \left[\left(\frac{1}{45} + \frac{1}{55}\right) S_{pool}\right]^{-1} \begin{pmatrix} 74.4 - \delta_1 \\ 201.6 - \delta_2 \end{pmatrix} \le c^2$$
where $c^2 = (98(2)/97) F_{2,97}(.05) = 2.02(3.1) = 6.26$.
The eigenvalues and eigenvectors of $S_{pool}$ are $(\lambda_1, e_1)$ and $(\lambda_2, e_2)$, with $\lambda_1 \ge \lambda_2$.
47 Computing the Axes of the Ellipse
Major axis:
$$\begin{pmatrix} 74.4 \\ 201.6 \end{pmatrix} \pm \sqrt{\lambda_1 \left(\frac{1}{n_1} + \frac{1}{n_2}\right) c^2}\; e_1 = \begin{pmatrix} 74.4 \\ 201.6 \end{pmatrix} \pm \sqrt{\lambda_1 \left(\frac{1}{45} + \frac{1}{55}\right)(6.26)}\; e_1$$
Minor axis:
$$\begin{pmatrix} 74.4 \\ 201.6 \end{pmatrix} \pm \sqrt{\lambda_2 \left(\frac{1}{45} + \frac{1}{55}\right)(6.26)}\; e_2$$
48 Figure of 95% Confidence Region
[Figure: 95% confidence ellipse for $\mu_1 - \mu_2$. Horizontal axis: $\mu_{11} - \mu_{21}$ (on-peak); vertical axis: $\mu_{12} - \mu_{22}$ (off-peak). Marked points: $\delta_o = (0, 0)$ and $d = (74.4, 201.6)$.]
49 Simultaneous $T^2$ Intervals
Let
$$c^2 = \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\, n_1 + n_2 - p - 1}(\alpha)$$
With confidence 100(1 − α)%,
$$a'(\bar{x}_1 - \bar{x}_2) \pm c \sqrt{a' \left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pool}\, a}$$
will cover $a'(\mu_1 - \mu_2)$ for all possible $a$.
By appropriate choices for $a$, we can get component intervals:
$$a_1' = (1, 0, \ldots, 0), \quad a_2' = (0, 1, \ldots, 0), \quad \ldots, \quad a_p' = (0, 0, \ldots, 1)$$
50 Simultaneous $T^2$ continued
So the component intervals are
$$(\bar{x}_{11} - \bar{x}_{21}) \pm c \sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{pool,11}}$$
$$(\bar{x}_{12} - \bar{x}_{22}) \pm c \sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{pool,22}}$$
$$\vdots$$
$$(\bar{x}_{1p} - \bar{x}_{2p}) \pm c \sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{pool,pp}}$$
where
$$c = \sqrt{\frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\, n_1 + n_2 - p - 1}(\alpha)}$$
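The component intervals above all share one shape: a mean difference plus or minus $c$ times a standard error. A minimal sketch, assuming the caller supplies $c$ (from an F table for simultaneous $T^2$ intervals, or from a t table for other interval types); the function name `component_intervals` and the numbers in the usage note are hypothetical.

```python
import math

def component_intervals(mean_diff, pooled_variances, n1, n2, c):
    """Intervals (xbar_1i - xbar_2i) +/- c * sqrt((1/n1 + 1/n2) * s_pool,ii).

    mean_diff        : differences in sample means, one per variable
    pooled_variances : diagonal entries s_pool,ii of S_pool, same order
    c                : critical constant chosen by the caller
    """
    factor = 1.0 / n1 + 1.0 / n2
    intervals = []
    for d, s_ii in zip(mean_diff, pooled_variances):
        half_width = c * math.sqrt(factor * s_ii)
        intervals.append((d - half_width, d + half_width))
    return intervals
```

With made-up inputs `component_intervals([10.0, 20.0], [25.0, 100.0], 50, 50, 2.5)` the standard errors are 1 and 2, giving intervals of roughly (7.5, 12.5) and (15, 25). Only the value of `c` changes between interval types, so the same function serves all of them.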
51 Example: Simultaneous $T^2$ intervals
Consider the linear combination vectors:
$a_1' = (1, 0)$, so $a_1'\delta = a_1'(\mu_1 - \mu_2) = \mu_{11} - \mu_{21} = \delta_1$
and
$a_2' = (0, 1)$, so $a_2'\delta = a_2'(\mu_1 - \mu_2) = \mu_{12} - \mu_{22} = \delta_2$
Using these we get the interval for on-peak, $\delta_1$:
$$74.4 \pm (2.502) \sqrt{\left(\frac{1}{45} + \frac{1}{55}\right) s_{pool,11}}$$
and for off-peak, $\delta_2$:
$$201.6 \pm (2.502) \sqrt{\left(\frac{1}{45} + \frac{1}{55}\right) s_{pool,22}}$$
Note: $c = \sqrt{c^2} = \sqrt{6.26} = 2.502$
52 Bonferroni and One-at-a-Time Intervals
For Bonferroni and One-at-a-Time (i.e., univariate method) intervals, you simply need to change the value of $c$.
Bonferroni:
$$c = t_{n_1 + n_2 - 2}(\alpha / 2m)$$
where $m$ = number of intervals formed (probably $p$, but no more). These should be planned a priori.
One-at-a-Time:
$$c = t_{n_1 + n_2 - 2}(\alpha / 2)$$
53 Example: Simultaneous $T^2$ and Bonferroni
[Figure: simultaneous $T^2$ and Bonferroni intervals for the air conditioner data. Horizontal axis: $\mu_{11} - \mu_{21}$ (on-peak); vertical axis: $\mu_{12} - \mu_{22}$ (off-peak). Marked points: $\delta_o = (0, 0)$ and $d = (74.4, 201.6)$.]
54 Case 3: Large $n_1 - p$ and $n_2 - p$
If $n_1 - p$ and $n_2 - p$ are large, then we do NOT need to assume:
$\Sigma_1 = \Sigma_2$.
$x_{1j}$ multivariate normal.
$x_{2j}$ multivariate normal.
We do need to assume that
Observations between populations are independent.
$x_{11}, \ldots, x_{1,n_1}$ are a random sample from population 1 with $\mu_1$ and $\Sigma_1$.
$x_{21}, \ldots, x_{2,n_2}$ are a random sample from population 2 with $\mu_2$ and $\Sigma_2$.
If $n_1 - p$ and $n_2 - p$ are large, then an approximate sampling distribution for the test statistic $T^2$ is $\chi^2_p$.
55 Large Sample Case
To test $H_o: \mu_1 - \mu_2 = \delta_o$, estimate the covariance matrix of the differences, $\Sigma_{\bar{x}_1 - \bar{x}_2}$... remember case 1?
$$\Sigma_{\bar{x}_1 - \bar{x}_2} = \Sigma_{\bar{x}_1} + \Sigma_{\bar{x}_2} = \frac{1}{n_1}\Sigma_1 + \frac{1}{n_2}\Sigma_2$$
which we can estimate using
$$\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2$$
Test statistic for $H_o: \mu_1 - \mu_2 = \delta_o$:
$$T^2 = \left((\bar{x}_1 - \bar{x}_2) - \delta_o\right)' \left(\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right)^{-1} \left((\bar{x}_1 - \bar{x}_2) - \delta_o\right) \approx \chi^2_p$$
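The large-sample statistic can be sketched for $p = 2$ without any matrix library, using the explicit inverse of a 2×2 matrix. The function names and the input values in the usage note are hypothetical. The critical value helper relies on the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability $e^{-x/2}$, so $\chi^2_2(\alpha) = -2\ln\alpha$; for other $p$ a table or a statistics package is needed.

```python
import math

def large_sample_t2(d, S1, S2, n1, n2):
    """T^2 = d' (S1/n1 + S2/n2)^{-1} d for p = 2.

    d      : (xbar_1 - xbar_2) - delta_o, a pair of numbers
    S1, S2 : 2x2 sample covariance matrices as nested lists
    """
    # Estimated covariance matrix of the difference in mean vectors.
    v11 = S1[0][0] / n1 + S2[0][0] / n2
    v12 = S1[0][1] / n1 + S2[0][1] / n2
    v22 = S1[1][1] / n1 + S2[1][1] / n2
    det = v11 * v22 - v12 ** 2
    # Quadratic form with the explicit 2x2 inverse (1/det) [[v22, -v12], [-v12, v11]].
    return (v22 * d[0] ** 2 - 2.0 * v12 * d[0] * d[1] + v11 * d[1] ** 2) / det

def chi2_critical_2df(alpha):
    """Upper-alpha critical value of chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)
```

With the made-up inputs `large_sample_t2((2.0, 3.0), [[4.0, 0.0], [0.0, 9.0]], [[4.0, 0.0], [0.0, 9.0]], 2, 2)` the statistic is 2.0, and `chi2_critical_2df(0.05)` is about 5.99, so $H_o$ would not be rejected for those illustrative numbers.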
56 Large Sample Case continued
A 100(1 − α)% confidence region (ellipsoid) for $\delta = \mu_1 - \mu_2$ is the set of all $\delta$ that satisfy
$$\left((\bar{x}_1 - \bar{x}_2) - \delta\right)' \left(\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right)^{-1} \left((\bar{x}_1 - \bar{x}_2) - \delta\right) \le \chi^2_p(\alpha)$$
For 100(1 − α)% simultaneous $\chi^2$ intervals:
$$a'(\bar{x}_1 - \bar{x}_2) \pm \sqrt{\chi^2_p(\alpha)} \sqrt{a' \left(\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right) a}$$
Let's try this for the air conditioner data...
57 Example using Large Sample
What if $\Sigma_1 \ne \Sigma_2$? $n_1$ and $n_2$ may be large enough to use the large sample theory.
Compute
$$\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2 = \frac{1}{45} S_1 + \frac{1}{55} S_2$$
and its inverse
$$\left[\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right]^{-1}$$
whose entries are reported with a common factor of $10^{-4}$.
58 Example: Large Sample Test Statistic
Test $H_o: \delta = 0$. The test statistic is
$$(\bar{x}_1 - \bar{x}_2)' \left[\frac{1}{n_1} S_1 + \frac{1}{n_2} S_2\right]^{-1} (\bar{x}_1 - \bar{x}_2), \qquad (\bar{x}_1 - \bar{x}_2)' = (74.4, 201.6)$$
For $\alpha = .05$, the critical value from $\chi^2_2$ is 5.99; the test statistic exceeds it (the p-value < .005).
Compare this with the $T^2$ computed using $S_{pool}$ (where we assumed that $\Sigma_1 = \Sigma_2$).
59 Large Sample $\chi^2$ Intervals
Using the same linear combination vectors as above:
$a_1' = (1, 0)$, so $a_1'\delta = a_1'(\mu_1 - \mu_2) = \mu_{11} - \mu_{21}$
and
$a_2' = (0, 1)$, so $a_2'\delta = a_2'(\mu_1 - \mu_2) = \mu_{12} - \mu_{22}$
$$(204.4 - 130.0) \pm \sqrt{5.99} \sqrt{a_1' \left(\frac{1}{45} S_1 + \frac{1}{55} S_2\right) a_1} = (21.7, 127.1)$$
$$(556.6 - 355.0) \pm \sqrt{5.99} \sqrt{a_2' \left(\frac{1}{45} S_1 + \frac{1}{55} S_2\right) a_2} = (75.8, 327.4)$$
which are very similar to the $T^2$ intervals given previously.
Note: $\chi^2_2(.05) = 5.99$
60 Large Sample with $n_1 = n_2$
We obtained similar results in our large and small sample procedures; one possible reason stems from $n_1 \approx n_2$. Note that when $n_1 = n_2 = n$,
$$\frac{n - 1}{n + n - 2} = \frac{1}{2}$$
so
$$\frac{1}{n} S_1 + \frac{1}{n} S_2 = \frac{1}{n}(S_1 + S_2) = \frac{2}{n} \left(\frac{(n - 1) S_1 + (n - 1) S_2}{n + n - 2}\right) = \left(\frac{1}{n} + \frac{1}{n}\right) S_{pool}$$
This implies that with equal sample sizes, the large sample procedure for computing an estimate of $\Sigma_{\bar{x}_1 - \bar{x}_2}$ is essentially the same as the procedure based on the pooled covariance matrix.
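The identity above is easy to check numerically. The sketch below builds both estimates of $\Sigma_{\bar{x}_1 - \bar{x}_2}$ for equal sample sizes and lets the caller compare them entry by entry; the helper names and the matrices in the usage note are hypothetical.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices stored as nested lists."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_scale(A, k):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * a for a in row] for row in A]

def separate_estimate(S1, S2, n):
    """Large-sample estimate (1/n) S1 + (1/n) S2 with n1 = n2 = n."""
    return mat_scale(mat_add(S1, S2), 1.0 / n)

def pooled_estimate(S1, S2, n):
    """Pooled estimate (1/n + 1/n) S_pool with n1 = n2 = n."""
    s_pool = mat_scale(mat_add(mat_scale(S1, n - 1), mat_scale(S2, n - 1)),
                       1.0 / (2 * n - 2))
    return mat_scale(s_pool, 2.0 / n)
```

For instance, with $n = 10$, `S1 = [[4.0, 1.0], [1.0, 9.0]]` and `S2 = [[2.0, 0.0], [0.0, 3.0]]`, both functions return (up to rounding) `[[0.6, 0.1], [0.1, 1.2]]`. The two procedures still use different reference distributions (F versus chi-square), which is why the resulting intervals are close but not identical.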
61 Case 4: Small sample with $\Sigma_1 \ne \Sigma_2$
We should consider whether $\Sigma_1 = \Sigma_2$ is a reasonable assumption.
If $n_1 - p$ and $n_2 - p$ are small and $\Sigma_1 \ne \Sigma_2$, then there's no nice measure like $T^2$ whose distribution does not depend on $\Sigma_1$ and $\Sigma_2$.
Rule-of-Thumb for when to worry about $\Sigma_1 \ne \Sigma_2$:
Don't worry if ratios $\sigma_{1,ik}/\sigma_{2,ik} \le 4$ (or $\sigma_{2,ik}/\sigma_{1,ik} \le 4$).
Our air conditioner example, ratio of the $(i, k)$ entry of $S_1$ to that of $S_2$:
(1, 1): 1.60
(1, 2): 1.21
(2, 2): 1.31
all $\le 4$
62 Testing whether $\Sigma_1 = \Sigma_2$
We could use Bartlett's test, but this assumes
Data are multivariate normal (not just that the means are multivariate normal).
$\Sigma_1 = \Sigma_2$.
So if you reject $H_o$ (significant test statistic), it could be because
$\Sigma_1 \ne \Sigma_2$
Data are not normal.
Or both $\Sigma_1 \ne \Sigma_2$ and data are not normal.
Additionally, for a valid test you need large samples, but if you have large samples you don't need to assume that $\Sigma_1 = \Sigma_2$ (or normality of the data).
63 Revisiting Examining Why
Our motivation for computing confidence intervals for components of the mean vector was to come to a conclusion about individual means. The simultaneous $T^2$ intervals hold for any $a$.
The $a$ that leads to the largest population difference is proportional to
$$\hat{a} = S^{-1}_{pool} (\bar{x}_1 - \bar{x}_2)$$
If the null hypothesis using $T^2$ is rejected, then $\hat{a}'(\bar{x}_1 - \bar{x}_2)$ has the largest possible statistic:
$$\hat{a}'(\bar{x}_1 - \bar{x}_2) = (\bar{x}_1 - \bar{x}_2)' S^{-1}_{pool} (\bar{x}_1 - \bar{x}_2)$$
which is a multiple of $T^2$.
$\hat{a}$ is useful for interpreting and describing why $H_o$ was rejected.
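The direction $\hat{a}$ is a single linear solve. The sketch below handles $p = 2$ with the explicit 2×2 inverse; the function name `discriminating_direction` and the numbers in the usage note are hypothetical, and `S` may be $S_{pool}$ or any other covariance estimate the analysis uses.

```python
def discriminating_direction(d, S):
    """Return a_hat = S^{-1} d for p = 2.

    d : difference in sample mean vectors, a pair of numbers
    S : 2x2 covariance matrix as nested lists
    """
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # Explicit inverse: S^{-1} = (1/det) [[s22, -s12], [-s21, s11]].
    a1 = (S[1][1] * d[0] - S[0][1] * d[1]) / det
    a2 = (-S[1][0] * d[0] + S[0][0] * d[1]) / det
    return (a1, a2)
```

As a check on made-up numbers, `discriminating_direction((2.0, 4.0), [[2.0, 0.0], [0.0, 4.0]])` gives `(1.0, 1.0)`, and $\hat{a}'d = 2 + 4 = 6$ matches the quadratic form $d' S^{-1} d = 4/2 + 16/4 = 6$.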
64 Interpretation
For the air conditioner data (using large sample), $\hat{a}$ is proportional to
$$\left[\frac{1}{45} S_1 + \frac{1}{55} S_2\right]^{-1} (\bar{x}_1 - \bar{x}_2) = \begin{pmatrix} .041 \\ .063 \end{pmatrix}$$
So the difference in $X_2$ (off-peak consumption) contributes more (.063 > .041) to the rejection of $H_o: \mu_1 - \mu_2 = 0$ via the $T^2$ test than $X_1$ (on-peak energy consumption).
Note: $\hat{a}'(\mu_1 - \mu_2) = .041(\mu_{11} - \mu_{21}) + .063(\mu_{12} - \mu_{22})$
65 Summary regarding Inferences about µ
Four reasons for taking a multivariate approach to hypothesis testing:
Reason 1: If you do $p$ univariate ($t$) tests, you have an inflated type I error rate (i.e., actual $\alpha$ larger than you want it to be). With a multivariate test, the exact $\alpha$ level is under your control.
E.g., if $p = 5$ and you perform $p$ separate univariate tests all at $\alpha = .05$, then
Prob{at least 1 false rejection} = Prob{at least 1 Type I error} > .05
In the extreme case where all the variables are independent, if $H_o$ is true,
Prob{at least 1 false rejection} = 1 − Prob{all $p$ retained} = $1 - (1 - \alpha)^p$
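The independent-variables bound above is a one-line computation. A small sketch (the function name `familywise_error` is a label chosen here, not something from the course materials):

```python
def familywise_error(alpha, p):
    """P(at least one false rejection) for p independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** p

# Upper end of the overall error rate for p tests at alpha = .05.
for p in (5, 10):
    print(p, round(familywise_error(0.05, p), 2))
```

For $\alpha = .05$ this gives about .23 when $p = 5$ and about .40 when $p = 10$; with a single test it reduces to $\alpha$ itself.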
66 Error Rates & More Reasons
Overall error rates are somewhere between
For $p = 5$: .05 and .23
For $p = 10$: .05 and .40
Reason 2: Univariate tests ignore (completely) the correlations between the variables. Multivariate tests make direct use of the covariance matrix.
Reason 3: Multivariate tests are more powerful (in most cases). Sometimes all $p$ univariate tests fail to reach significance, but the multivariate test is significant because small effects combine to jointly indicate significance.
Note: For a given sample size, there is a limit to the number of variables a multivariate test can handle without losing power.
More informationThe Multivariate Normal Distribution 1
The Multivariate Normal Distribution 1 STA 302 Fall 2014 1 See last slide for copyright information. 1 / 37 Overview 1 Moment-generating Functions 2 Definition 3 Properties 4 χ 2 and t distributions 2
More informationThe Random Effects Model Introduction
The Random Effects Model Introduction Sometimes, treatments included in experiment are randomly chosen from set of all possible treatments. Conclusions from such experiment can then be generalized to other
More informationSTAT 501 Assignment 1 Name Spring 2005
STAT 50 Assignment Name Spring 005 Reading Assignment: Johnson and Wichern, Chapter, Sections.5 and.6, Chapter, and Chapter. Review matrix operations in Chapter and Supplement A. Written Assignment: Due
More information8 Eigenvectors and the Anisotropic Multivariate Gaussian Distribution
Eigenvectors and the Anisotropic Multivariate Gaussian Distribution Eigenvectors and the Anisotropic Multivariate Gaussian Distribution EIGENVECTORS [I don t know if you were properly taught about eigenvectors
More informationHYPOTHESIS TESTING. Hypothesis Testing
MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.
More informationST505/S697R: Fall Homework 2 Solution.
ST505/S69R: Fall 2012. Homework 2 Solution. 1. 1a; problem 1.22 Below is the summary information (edited) from the regression (using R output); code at end of solution as is code and output for SAS. a)
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationSTAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS
STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationCIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8
CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval
More informationM(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1
Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA
More informationM A N O V A. Multivariate ANOVA. Data
M A N O V A Multivariate ANOVA V. Čekanavičius, G. Murauskas 1 Data k groups; Each respondent has m measurements; Observations are from the multivariate normal distribution. No outliers. Covariance matrices
More informationMULTIVARIATE POPULATIONS
CHAPTER 5 MULTIVARIATE POPULATIONS 5. INTRODUCTION In the following chapters we will be dealing with a variety of problems concerning multivariate populations. The purpose of this chapter is to provide
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationStat 206: Estimation and testing for a mean vector,
Stat 206: Estimation and testing for a mean vector, Part II James Johndrow 2016-12-03 Comparing components of the mean vector In the last part, we talked about testing the hypothesis H 0 : µ 1 = µ 2 where
More informationBayesian Decision Theory
Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian
More informationMultilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do
More informationPhysics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester
Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability
More informationIntroduction to Business Statistics QM 220 Chapter 12
Department of Quantitative Methods & Information Systems Introduction to Business Statistics QM 220 Chapter 12 Dr. Mohammad Zainal 12.1 The F distribution We already covered this topic in Ch. 10 QM-220,
More informationSTA Module 10 Comparing Two Proportions
STA 2023 Module 10 Comparing Two Proportions Learning Objectives Upon completing this module, you should be able to: 1. Perform large-sample inferences (hypothesis test and confidence intervals) to compare
More informationSTAT 501 Assignment 1 Name Spring Written Assignment: Due Monday, January 22, in class. Please write your answers on this assignment
STAT 5 Assignment Name Spring Reading Assignment: Johnson and Wichern, Chapter, Sections.5 and.6, Chapter, and Chapter. Review matrix operations in Chapter and Supplement A. Examine the matrix properties
More informationIndependent Component (IC) Models: New Extensions of the Multinormal Model
Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research
More informationIntroduction to the Analysis of Variance (ANOVA)
Introduction to the Analysis of Variance (ANOVA) The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique for testing for differences between the means of multiple (more
More informationAnalysis of variance (ANOVA) Comparing the means of more than two groups
Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 17 for Applied Multivariate Analysis Outline Multivariate Analysis of Variance 1 Multivariate Analysis of Variance The hypotheses:
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationConfidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationRank-Based Methods. Lukas Meier
Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data
More informationData are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA)
BSTT523 Pagano & Gauvreau Chapter 13 1 Nonparametric Statistics Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) In particular, data
More informationMultivariate analysis of variance and covariance
Introduction Multivariate analysis of variance and covariance Univariate ANOVA: have observations from several groups, numerical dependent variable. Ask whether dependent variable has same mean for each
More informationProduct Held at Accelerated Stability Conditions. José G. Ramírez, PhD Amgen Global Quality Engineering 6/6/2013
Modeling Sub-Visible Particle Data Product Held at Accelerated Stability Conditions José G. Ramírez, PhD Amgen Global Quality Engineering 6/6/2013 Outline Sub-Visible Particle (SbVP) Poisson Negative Binomial
More informationTopic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing. Road Map Sampling Distributions, Confidence Intervals & Hypothesis Testing
Topic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing ECO22Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit
More informationBusiness Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee
Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)
More informationSample Size and Power Considerations for Longitudinal Studies
Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous
More informationGroup comparison test for independent samples
Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences between means. Supposing that: samples come from normal populations
More informationOne-way ANOVA. Experimental Design. One-way ANOVA
Method to compare more than two samples simultaneously without inflating Type I Error rate (α) Simplicity Few assumptions Adequate for highly complex hypothesis testing 09/30/12 1 Outline of this class
More information