Rejection regions for the bivariate case

The rejection region for the $T^2$ test (and similarly for $Z^2$ when $\Sigma$ is known) is the region outside an ellipse. Under the null hypothesis, there is a $100(1-\alpha)\%$ chance that a random data set falls inside the ellipse. If instead you reject the null hypothesis whenever either of two univariate t-tests rejects, then the rejection region is the area outside a rectangle.

STAT476/STAT576 February 20

[Figure: Rejection regions]

Rejection regions

The graphic illustrates that, for a given $\alpha$, if you don't correct for multiple testing, there are points (dark grey) where the univariate tests do not reject $\mu_1 = \mu_{01}$ and $\mu_2 = \mu_{02}$ (or $\mu_{11} = \mu_{21}$ and $\mu_{12} = \mu_{22}$ for the two-sample problem) when you should. This indicates lower power for the univariate approach.

Similarly, the points in the light grey areas are points where you reject the null hypothesis using the univariate approach but shouldn't, and wouldn't using the multivariate approach. The white region indicates cases where neither approach rejects the null hypothesis. Points outside both the rectangle and the ellipse are cases where both approaches reject the null hypothesis.
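The two grey regions can be illustrated numerically. The sketch below is only an illustration under assumptions that are not from the lecture: the covariance is treated as known (so the multivariate statistic is $Z^2$ with a chi-square reference distribution on 2 degrees of freedom), the two standardized means have a hypothetical correlation of 0.8, and the function name `decisions` is made up for this example.

```python
import math
from statistics import NormalDist

ALPHA = 0.05
RHO = 0.8  # hypothetical correlation between the two standardized sample means

def decisions(z1, z2, rho=RHO, alpha=ALPHA):
    """Return (multivariate_rejects, univariate_rejects) for standardized means z1, z2."""
    # Z^2 = z' R^{-1} z for a 2x2 correlation matrix R with off-diagonal rho.
    z_sq = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    # A chi-square variable with 2 df has survival function exp(-x/2),
    # so its upper-alpha critical value is -2 ln(alpha).
    chi2_crit = -2.0 * math.log(alpha)
    # Two-sided critical value for each uncorrected univariate z-test.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z_sq > chi2_crit, (abs(z1) > z_crit or abs(z2) > z_crit)

dark_grey = decisions(1.8, -1.8)   # inside the rectangle, outside the ellipse
light_grey = decisions(2.1, 2.1)   # outside the rectangle, inside the ellipse
print(dark_grey, light_grey)
```

The first point is rejected only by the multivariate test and the second only by the univariate tests, which is the pattern the dark and light grey regions describe.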

Properties of the $T^2$ statistic

We need $n - 1 > p$ (one sample) or $n_1 + n_2 - 2 > p$ (two samples) for $S$ and $S_{pl}$, respectively, to be nonsingular.

The degrees of freedom are the same as for the univariate t-tests: $n - 1$ for one sample, $n_1 + n_2 - 2$ for the two-sample test.

The density of $T^2$ is a scaled version of the F distribution, so it is skewed, and you reject $H_0: \mu = \mu_0$ or $H_0: \mu_1 = \mu_2$ for large values of $T^2$ even though the test is not one-sided.

The $T^2$ approach turns out to be equivalent to the likelihood ratio approach, which has good properties on theoretical grounds.
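For reference, the scaling between $T^2$ and F mentioned above can be written out. This is the standard relationship, stated here with $\nu$ denoting the degrees of freedom ($\nu = n - 1$ for one sample, $\nu = n_1 + n_2 - 2$ for two samples); the one-sample form of $T^2$ is included as a reminder and is an assumption about notation rather than a quotation from the slides.

```latex
T^2 = n\,(\bar{y} - \mu_0)'\, S^{-1} (\bar{y} - \mu_0),
\qquad
\frac{\nu - p + 1}{\nu p}\; T^2_{p,\nu} = F_{p,\;\nu - p + 1}.
```

Because the multiplier $(\nu - p + 1)/(\nu p)$ is positive, large values of $T^2$ correspond to large values of F, which is why only the upper tail is used.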

Tests on individual variables conditional on rejecting the multivariate test

As illustrated by the dark grey region in the graphic, there are cases where $H_0: \mu_1 = \mu_2$ is rejected but $H_0: \mu_{1j} = \mu_{2j}$ is not rejected by a univariate t-test for any particular $j$. However, it is possible to construct a linear combination of the variables, $z = a'y$, for which a univariate test will reject $H_0: \mu_{z1} = \mu_{z2}$. We could also write this hypothesis as $H_0: a'\mu_1 = a'\mu_2$.

Tests on individual variables conditional on rejecting the multivariate test

To test the linear combination, we compute $\bar{z}_j = a'\bar{y}_j$, $j = 1, 2$; the pooled sample variance of $z_1$ and $z_2$ is $s_z^2 = a'S_{pl}a$. Therefore the t statistic, as a function of $a$, is

$$t(a) = \frac{\bar{z}_1 - \bar{z}_2}{\sqrt{(1/n_1 + 1/n_2)\, s_z^2}} = \frac{a'\bar{y}_1 - a'\bar{y}_2}{\sqrt{(1/n_1 + 1/n_2)\, a'S_{pl}a}}.$$

The value of $a$ that maximizes $t^2(a)$ is any multiple of $a = S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)$.
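The maximization claim can be checked numerically. The following is a minimal sketch with made-up summary statistics (the sample sizes, mean difference, and pooled covariance are hypothetical, not the lecture's data); `inv2` and `t_squared` are helper names introduced only for this example. It relies on the standard fact that the maximum of $t^2(a)$ equals the two-sample $T^2$.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def t_squared(a, diff, s_pl, n1, n2):
    """t^2(a) = (a'diff)^2 / ((1/n1 + 1/n2) a' S_pl a)."""
    num = (a[0] * diff[0] + a[1] * diff[1]) ** 2
    quad = sum(a[i] * s_pl[i][j] * a[j] for i in range(2) for j in range(2))
    return num / ((1.0 / n1 + 1.0 / n2) * quad)

# Hypothetical summary statistics (not from the lecture's data).
n1, n2 = 10, 12
diff = [1.0, 0.5]                 # ybar_1 - ybar_2
s_pl = [[2.0, 1.2], [1.2, 1.5]]   # pooled covariance matrix

# Discriminant coefficients a = S_pl^{-1} (ybar_1 - ybar_2).
s_inv = inv2(s_pl)
a_star = [s_inv[0][0] * diff[0] + s_inv[0][1] * diff[1],
          s_inv[1][0] * diff[0] + s_inv[1][1] * diff[1]]

# Hotelling's two-sample T^2 = (n1 n2 / (n1 + n2)) diff' S_pl^{-1} diff.
T2 = (n1 * n2 / (n1 + n2)) * (diff[0] * a_star[0] + diff[1] * a_star[1])

t2_best = t_squared(a_star, diff, s_pl, n1, n2)
t2_y1_only = t_squared([1.0, 0.0], diff, s_pl, n1, n2)   # univariate test on y1
t2_y2_only = t_squared([0.0, 1.0], diff, s_pl, n1, n2)   # univariate test on y2
print(T2, t2_best, t2_y1_only, t2_y2_only)
```

In this sketch `t2_best` matches `T2`, and both single-variable choices of $a$ give smaller values of $t^2(a)$.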

The discriminant function

For $a = S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)$, $z = a'y$ is called the discriminant function. If $H_0: \mu_1 = \mu_2$ is rejected, then you can construct $a = S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)$ and examine the $a_j$ values to determine which variables contribute most to the rejection of the null hypothesis.

Other procedures to follow up a multivariate rejection of $H_0: \mu_1 = \mu_2$

1. Do univariate t-tests, with some risk of inflated type I error.
2. Do univariate t-tests with Bonferroni corrections for the $\alpha$ levels. If testing $p$ variables, use $\alpha/p$ instead of $\alpha$. For example, if $p = 5$, use $\alpha = .01$ instead of $\alpha = .05$. Note that the book writes the critical t value as $t_{\alpha/2p,\,n_1+n_2-2}$, which means $t_{\alpha/(2p),\,n_1+n_2-2}$.
3. Use $T_{\alpha,p,n_1+n_2-2} = \sqrt{T^2_{\alpha,p,n_1+n_2-2}}$. That is, use the square root of the critical value of Hotelling's $T^2$ as the critical value for the univariate t-tests. This is more conservative than the Bonferroni correction, but it allows looking at linear combinations as well.

There are other methods as well that we will see later in the course.
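The Bonferroni arithmetic in the second procedure can be sketched in a few lines. Note the loudly stated assumption: the calculation below treats the $p$ tests as independent, which is generally not true for correlated variables, so it only illustrates the direction of the effect. The function name `experimentwise_rate` is made up for this example.

```python
def experimentwise_rate(per_test_alpha, p):
    """P(at least one of p tests rejects) when all nulls are true.

    Assumes the p tests are independent (an illustrative simplification).
    """
    return 1.0 - (1.0 - per_test_alpha) ** p

alpha, p = 0.05, 5
uncorrected = experimentwise_rate(alpha, p)      # each test run at level alpha
bonferroni = experimentwise_rate(alpha / p, p)   # each test run at level alpha / p
print(round(uncorrected, 4), round(bonferroni, 4))
```

Under this independence assumption the uncorrected procedure has an overall error rate well above $\alpha$, while the Bonferroni version stays just below it.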

Follow up tests

The first procedure is too liberal (rejects too easily), while the second and third procedures are too conservative (they often don't reject when they should).

If you carry out individual tests only after an overall $T^2$ test rejects, then you reject less often (because you might not reject at the first step). This makes the first procedure less liberal and more acceptable, while it makes the second and third procedures more conservative.

The probability of rejecting at least one of the $p$ univariate tests when the null hypothesis is true is called the overall $\alpha$ or experimentwise error rate. Simulations have been done to compare the performance of these different methods. Here we compare doing univariate t-tests either conditional or not conditional on rejection using $T^2$. No Bonferroni correction is used.

Rejection regions

[Slide content not preserved in the transcription]

Follow up tests

The results suggest that univariate tests conditional on rejection using $T^2$ are quite acceptable (still slightly conservative, since the error rate is less than $\alpha$), and would be expected to be more powerful than using a Bonferroni correction.

The discriminant function: example

Recall the data from the psychological testing on men and women. For these data, the discriminant function coefficients are $a = S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)$ [numerical values not preserved in the transcription]. The linear combination $a'y$ of $y_1, \dots, y_4$ with these coefficients gives the greatest separation of the two groups. Thus $y_1$ and $y_3$, pictorial inconsistencies and tool recognition, are the variables that separate the men and women samples the most for these data.

Computing $T^2$ and matched pairs

The book has some sections on how to compute $T^2$ using other computer output, such as multiple regression output, which we will skip. It is easy enough to compute $T^2$ these days.

The book also has a section on paired differences, which we will skip. The basic idea is that we can have multivariate analogues of the matched pairs t-test. This is appropriate for the couples data that we looked at earlier. Basically, you can look at differences within a pair and analyze these as raw data. For the couples data, we could look at the age difference and height difference for each couple and analyze these two differences as a single bivariate normal sample.
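The paired-difference idea amounts to a small data transformation. The sketch below uses invented numbers (the ages and heights are hypothetical and are not the couples data from the course); it only shows how each pair collapses to one bivariate observation.

```python
# Hypothetical (age in years, height in cm) measurements for four couples.
husbands = [(34, 178.0), (41, 172.5), (29, 181.0), (52, 169.0)]
wives = [(31, 165.0), (42, 160.5), (27, 170.0), (49, 158.0)]

# Within-pair differences become the raw bivariate observations.
diffs = [(h[0] - w[0], h[1] - w[1]) for h, w in zip(husbands, wives)]

# Sample mean vector of the differences.
n = len(diffs)
mean_diff = tuple(sum(d[k] for d in diffs) / n for k in range(2))
print(diffs, mean_diff)
```

The differences would then be analyzed with a one-sample $T^2$ test of the hypothesis that the mean difference vector is zero.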

Test for additional information

For a two-sample problem, one thing we may wish to do is to determine, given a set of $p + q$ variables, whether $q$ of the variables are redundant. In other words, does the addition of the $q$ variables significantly increase $T^2$?

Let $y_j$ denote the $p$-variable data and let $x_j$ denote the $q$-variable data, where there are two samples, $j = 1, 2$. The samples can be described as follows:

Test for additional information

[Two slides; content not preserved in the transcription]

Test for additional information

In the previous slide, $\nu = n_1 + n_2 - 2$. $T^2(x|y)$ is distributed as $T^2_{q,\nu-p}$, but it is probably easier to convert to an F using $T^2_{p+q}$ and $T^2_p$. The F test has $q$ numerator and $\nu - p - q + 1$ denominator degrees of freedom. The null hypothesis that the additional information in $x$ is redundant is rejected for sufficiently large F values.
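The conversion to F can be sketched as follows. The slide carrying the formula is not preserved here, so this sketch assumes the usual textbook form, $F = \frac{\nu - p - q + 1}{q}\cdot\frac{T^2_{p+q} - T^2_p}{\nu + T^2_p}$, which agrees with the degrees of freedom stated above. The function name and the input values are hypothetical.

```python
def additional_information_f(t2_full, t2_reduced, n1, n2, p, q):
    """F statistic and degrees of freedom for testing whether q extra variables are redundant.

    t2_full is T^2 computed from all p + q variables; t2_reduced uses the p variables only.
    Assumes the usual textbook formula (not quoted from the lecture slides).
    """
    nu = n1 + n2 - 2
    df1, df2 = q, nu - p - q + 1
    f_stat = (df2 / df1) * (t2_full - t2_reduced) / (nu + t2_reduced)
    return f_stat, df1, df2

# Hypothetical inputs, for illustration only.
f_stat, df1, df2 = additional_information_f(
    t2_full=40.0, t2_reduced=30.0, n1=20, n2=22, p=3, q=1
)
print(f_stat, df1, df2)
```

The resulting statistic would be compared with the upper-$\alpha$ critical value of the F distribution on `df1` and `df2` degrees of freedom.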

Example with psychological data

For this data set, you could test whether any of the variables is redundant (so do a separate test for each of the four variables, conditional on the other three being present). The critical value is 4.002. Calculating the statistics for each variable yields: [values not preserved in the transcription]

Example with psychological data

Since variable $y_2$ has a test statistic lower than 4.002, it appears not to contribute significantly to the separation between men and women, given that the other three variables are present. Thus, if the goal were to create a psychological test that could distinguish between men and women, the other three variables would have been sufficient. On the other hand, given that the three other variables were present, $y_1$, $y_3$ and $y_4$ each contributed significantly to increasing $T^2$ and therefore to the separation between men and women.

Profile Analysis

When multivariate data are the result of repeated measures, you might want to see how the mean is changing over time. As an example, you might have data on blood pressure taken once per month for the same set of patients.

It can be a little ambiguous whether you think of this as multivariate or univariate data. Suppose we just keep track of systolic blood pressure. You could think of the response as blood pressure, with the time (month) and individual as predictors. In this case, the number of observations is the number of individuals times the number of time points. Or you could think of the number of observations as the number of individuals, with each observation being multivariate, with one variable (or response) for each time point.

Profile analysis

Profile analysis can also be used when variables are not ordered in time (or in any other way), but the values are on a similar scale, such as in psychological testing. A profile plot uses the subscript of the variable on the x-axis and the means $\mu_1, \dots, \mu_p$ on the y-axis. We can do both one-sample and two-sample profile plots. We'll start with one sample.

[Figure: Profile plot]

Profile plot

The null hypothesis that all means are equal, $H_0: \mu_1 = \mu_2 = \cdots = \mu_p$, can be interpreted graphically as the hypothesis that the profile plot is flat. The alternative hypothesis is $H_1: \mu_i \neq \mu_j$ for some $i, j$.

Note that these hypotheses are the same as for ANOVA, but the analysis will be different because the columns are not assumed to be independent. In ANOVA, you would have samples from $p$ populations that are assumed to be independent. Here we have a single population but are sampling correlated variables from this population.

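Profile plot

Two equivalent ways of expressing the null hypothesis are

$$H_0: \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_2 - \mu_3 \\ \vdots \\ \mu_{p-1} - \mu_p \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \quad\text{and}\quad H_0: \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_1 - \mu_3 \\ \vdots \\ \mu_1 - \mu_p \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$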

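Profile plot

The left-hand vectors have entries that are linear combinations of $\mu_1, \dots, \mu_p$, so we can represent them using matrix multiplication with

$$C_1 = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & -1 \end{pmatrix}, \qquad C_2 = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{pmatrix}.$$

Here $C_1$ and $C_2$ are $(p-1) \times p$ and have rank $p - 1$. The null hypotheses can then be written as $H_0: C_1\mu = 0$ and $H_0: C_2\mu = 0$.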
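The two contrast matrices can be built and checked programmatically. This is a minimal sketch; the function names `successive_contrasts`, `baseline_contrasts`, and `matvec`, and the two mean vectors, are made up for illustration.

```python
def successive_contrasts(p):
    """C1: row i encodes mu_{i+1} - mu_{i+2} (1-based), i.e. adjacent differences."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(p)]
            for i in range(p - 1)]

def baseline_contrasts(p):
    """C2: row i encodes mu_1 - mu_{i+2} (1-based), i.e. differences from the first mean."""
    return [[1 if j == 0 else -1 if j == i + 1 else 0 for j in range(p)]
            for i in range(p - 1)]

def matvec(c, v):
    """Multiply a matrix (nested lists) by a vector (list)."""
    return [sum(cij * vj for cij, vj in zip(row, v)) for row in c]

C1 = successive_contrasts(4)
C2 = baseline_contrasts(4)

flat = [5.0, 5.0, 5.0, 5.0]     # hypothetical flat profile
sloped = [1.0, 2.0, 4.0, 7.0]   # hypothetical non-flat profile

print(matvec(C1, flat), matvec(C2, flat))
print(matvec(C1, sloped), matvec(C2, sloped))
```

Both matrices map the flat profile to the zero vector and the non-flat profile to a nonzero vector, and every row of each matrix sums to zero.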

Profile plot

More generally, if a matrix $C$ satisfies $Cj = 0$, where $j$ is a vector of ones, then each row of $C$ sums to 0. If, in addition, $C$ has rank $p - 1$, then $C\mu = 0$ can only be satisfied if $\mu_1 = \cdots = \mu_p$.

In particular, if one of the rows of $C$, say the $i$th row $c_i'$, sums to 0, then $c_i'\mu$ is a contrast. For example, we might have

$$c_i'\mu = \frac{\mu_1 + \mu_2}{2} - \frac{\mu_1 + \mu_2 + \mu_3}{3}.$$

Here $C$, and $C_1$ and $C_2$ from the previous slide, are called contrast matrices.

Profile plot

To test the null hypothesis, we let $\bar{z} = C\bar{y}$ and $S_z = CSC'$. Assuming $y \sim N_p(\mu, \Sigma)$, under $H_0$ we have $\bar{z} \sim N_{p-1}(0, C\Sigma C'/n)$.

Profile plot

To convert this to a $T^2$ statistic, you can use

$$T^2 = \bar{z}'(CSC'/n)^{-1}\bar{z},$$

which has a $T^2_{p-1,n-1}$ distribution. To convert to an F statistic, start from the general relation and substitute $p - 1$ for $p$ and $\nu = n - 1$:

$$\frac{\nu - p + 1}{\nu p}\, T^2_{p,\nu} = F_{p,\nu-p+1}$$

$$\frac{(n-1) - (p-1) + 1}{(p-1)(n-1)}\, T^2 = F_{p-1,\,(n-1)-(p-1)+1}$$

$$\frac{n - p + 1}{(p-1)(n-1)}\, T^2 = F_{p-1,\,n-p+1}$$
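A small worked computation may help make the one-sample flatness test concrete. Everything numeric below is hypothetical (three variables, made-up mean vector and covariance matrix), and the helper names are introduced only for this sketch; $p = 3$ is chosen so that $CSC'$ is $2 \times 2$ and can be inverted by hand.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical one-sample summary statistics with p = 3 variables.
n, p = 20, 3
ybar = [10.0, 11.0, 13.0]
S = [[4.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.5, 1.0, 5.0]]
C = [[1, -1, 0], [0, 1, -1]]   # adjacent-difference contrast matrix

# zbar = C ybar and S_z = C S C'.
zbar = [sum(c * y for c, y in zip(row, ybar)) for row in C]
csc = [[sum(C[i][k] * S[k][l] * C[j][l] for k in range(p) for l in range(p))
        for j in range(p - 1)] for i in range(p - 1)]
csc_inv = inv2(csc)

# T^2 = zbar' (C S C' / n)^{-1} zbar = n * zbar' (C S C')^{-1} zbar.
quad = sum(zbar[i] * csc_inv[i][j] * zbar[j] for i in range(2) for j in range(2))
T2 = n * quad

# F = (n - p + 1) / ((p - 1)(n - 1)) * T^2 on (p - 1, n - p + 1) degrees of freedom.
F = (n - p + 1) / ((p - 1) * (n - 1)) * T2
df = (p - 1, n - p + 1)
print(T2, F, df)
```

The F value would then be compared with the upper-$\alpha$ critical value of the F distribution on the degrees of freedom in `df`.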

Profile plots: two samples

We can also do profile analysis with two samples. Here we have two profiles. Graphically, we might be interested in the following questions:

Are the two profiles both flat?
Are the two profiles parallel?
Are the two profiles at the same level (is one higher than the other)?

[Figure: Profile plot: two samples]

Profile plot: two samples

For two samples, the null hypothesis that the two profiles have the same slopes (are parallel) can be tested by checking whether the differences between successive means are equal for the two groups. This can be written as $H_0: \mu_{1,j} - \mu_{1,j-1} = \mu_{2,j} - \mu_{2,j-1}$ for $j = 2, \dots, p$. In matrix notation, this is $H_0: C\mu_1 = C\mu_2$, where $C$ is a contrast matrix.

Profile analysis: two samples

The test statistic is

$$T^2 = (C\bar{y}_1 - C\bar{y}_2)' \left[ \left( \frac{1}{n_1} + \frac{1}{n_2} \right) C S_{pl} C' \right]^{-1} (C\bar{y}_1 - C\bar{y}_2) = \frac{n_1 n_2}{n_1 + n_2} (\bar{y}_1 - \bar{y}_2)' C' \left[ C S_{pl} C' \right]^{-1} C (\bar{y}_1 - \bar{y}_2).$$

Note that if you try to distribute the inverse, you might be tempted to write terms like $C^{-1}C$, but this is incorrect: $C$ doesn't have an inverse since it is not square. The test statistic has a $T^2$ distribution with $p - 1$ and $n_1 + n_2 - 2$ degrees of freedom, which again can be converted to an F.
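The equality of the two forms of the statistic can be verified on a toy example. The summary statistics below are hypothetical (three variables, invented means and pooled covariance), and `inv2` and `quad_form` are helper names used only in this sketch.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(v, m):
    """Compute v' m v for a 2-vector v and a 2x2 matrix m."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

# Hypothetical two-sample summary statistics with p = 3 variables.
n1, n2, p = 15, 17, 3
ybar1 = [10.0, 12.0, 15.0]
ybar2 = [9.0, 12.0, 13.0]
S_pl = [[4.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.5, 1.0, 5.0]]
C = [[1, -1, 0], [0, 1, -1]]   # adjacent-difference contrast matrix

# C(ybar1 - ybar2) and C S_pl C'.
d = [sum(c * (a - b) for c, a, b in zip(row, ybar1, ybar2)) for row in C]
csc = [[sum(C[i][k] * S_pl[k][l] * C[j][l] for k in range(p) for l in range(p))
        for j in range(p - 1)] for i in range(p - 1)]

# First form: invert (1/n1 + 1/n2) C S_pl C' directly.
scale = 1.0 / n1 + 1.0 / n2
T2_first = quad_form(d, inv2([[scale * x for x in row] for row in csc]))

# Second form: pull the scalar out as n1 n2 / (n1 + n2).
T2_second = (n1 * n2 / (n1 + n2)) * quad_form(d, inv2(csc))
print(T2_first, T2_second)
```

Only the $(p-1) \times (p-1)$ matrix $CS_{pl}C'$ is ever inverted; $C$ itself is never inverted, in line with the warning above.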

Profile analysis: two samples

The biggest discrepancy in the slopes can be found by looking at the discriminant function, for which we use

$$a = (C S_{pl} C')^{-1} C (\bar{y}_1 - \bar{y}_2).$$

If the largest component of $a$ (in absolute value) is $a_i$, then the $i$th slope has the largest violation of parallelism. This approach is similar to testing for an interaction in two-way ANOVA, but again ANOVA assumes independent populations, whereas here we allow correlation between the $p$ variables.

Profile analysis: two samples

A second question is whether the two samples have the same overall level, ignoring possible interaction. This hypothesis can be expressed as

$$H_0: \mu_{11} + \cdots + \mu_{1p} = \mu_{21} + \cdots + \mu_{2p}$$

or

$$H_0: j'\mu_1 = j'\mu_2$$

or

$$H_0: j'(\mu_1 - \mu_2) = 0.$$

This null hypothesis can be false even when there is no interaction, such as when the plots are parallel but one plot is higher than the other. On the other hand, when there is interaction but the two plots have the same average value, this null hypothesis can be satisfied.

[Figure: Profile plot: two samples]

Profile analysis: two samples

In this case the test statistic is

$$t = \frac{j'(\bar{y}_1 - \bar{y}_2)}{\sqrt{j' S_{pl}\, j\,(1/n_1 + 1/n_2)}},$$

which has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.
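The level statistic is simple enough to compute directly. The sketch below uses hypothetical summary statistics (invented means, pooled covariance, and sample sizes); since $j$ is a vector of ones, $j'(\bar{y}_1 - \bar{y}_2)$ is the sum of the mean differences and $j'S_{pl}j$ is the sum of all entries of $S_{pl}$.

```python
import math

# Hypothetical two-sample summary statistics with p = 3 variables.
n1, n2 = 15, 17
ybar1 = [10.0, 12.0, 15.0]
ybar2 = [9.0, 12.0, 13.0]
S_pl = [[4.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.5, 1.0, 5.0]]

# j'(ybar1 - ybar2): sum of the componentwise mean differences.
numerator = sum(a - b for a, b in zip(ybar1, ybar2))

# j' S_pl j: sum of every entry of the pooled covariance matrix.
j_s_j = sum(sum(row) for row in S_pl)

t_stat = numerator / math.sqrt(j_s_j * (1.0 / n1 + 1.0 / n2))
df = n1 + n2 - 2
print(t_stat, df)
```

The value of `t_stat` would be compared with a two-sided critical value of the t distribution on `df` degrees of freedom.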

Profile analysis: two samples

Finally, a third hypothesis is that the two profiles are both flat, which would imply that there is no interaction but is stronger in requiring that the slopes also don't change. This hypothesis does not require the same overall mean for the two profiles.

Profile analysis: two samples

Because lack of parallelism automatically implies that at least one curve is not flat, the hypothesis that both curves are flat is more interesting if the hypothesis of parallelism can't be rejected. If it is rejected, then you could test individually whether each curve is flat using a one-sample technique. If the curves are roughly parallel, then it makes sense to test for flatness (if this is of interest).

This is a bit like testing for a main effect in an ANOVA after finding no interaction. If there is an interaction, then you normally wouldn't also test for a main effect.

[Figure: Profile plot: two samples]

Profile plot: two samples

Recall the example of the psychological profile of men and women:

[Data and figure not preserved in the transcription]

Profile plot: two samples

For these data, the two profiles do not appear to be parallel, and one group has consistently larger values than the other at each variable. To do some formal tests, we can test for parallelism using a contrast matrix $C$ [matrix entries not preserved in the transcription]. Then the null hypothesis is $H_0: C(\mu_1 - \mu_2) = 0$.

Profile plot: two samples

[Slide content not preserved in the transcription]

Profile plot: two samples

The book uses the $T^2$ table to compare this to a critical value. You can do this or convert to an F. To convert to an F, we again use

$$\frac{\nu - p + 1}{\nu p}\, T^2 = F_{p,\nu-p+1}.$$

Here we use $p - 1$ in place of $p$ because $p - 1$ is the rank of $C$, and we use $\nu = n_1 + n_2 - 2$. This yields

$$F = \frac{n_1 + n_2 - 2 - (p - 1) + 1}{(n_1 + n_2 - 2)(p - 1)}\, T^2.$$

[The numerical values of $T^2$ on the slide are not preserved in the transcription.] With $F = 23.94$, the p-value is on the order of $10^{-10}$, computed in R as 1-pf(23.94,3,62), so there is clear evidence against parallelism.

HW2: due in two weeks, March 2nd

STA 437: Applied Multivariate Statistics

STA 437: Applied Multivariate Statistics Al Nosedal. University of Toronto. Winter 2015 1 Chapter 5. Tests on One or Two Mean Vectors If you can t explain it simply, you don t understand it well enough Albert Einstein. Definition Chapter 5. Tests

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Inferences about a Mean Vector

Inferences about a Mean Vector Inferences about a Mean Vector Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees, University

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 13 for Applied Multivariate Analysis Outline Two sample Profile Analysis-reprise 1 Two sample Profile Analysis-reprise Two sample

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Your schedule of coming weeks. One-way ANOVA, II. Review from last time. Review from last time /22/2004. Create ANOVA table

Your schedule of coming weeks. One-way ANOVA, II. Review from last time. Review from last time /22/2004. Create ANOVA table Your schedule of coming weeks One-way ANOVA, II 9.07 //00 Today: One-way ANOVA, part II Next week: Two-way ANOVA, parts I and II. One-way ANOVA HW due Thursday Week of May Teacher out of town all week

More information

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Comparisons of Two Means Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c

More information

Introduction to the Analysis of Variance (ANOVA)

Introduction to the Analysis of Variance (ANOVA) Introduction to the Analysis of Variance (ANOVA) The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique for testing for differences between the means of multiple (more

More information

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests: One sided tests So far all of our tests have been two sided. While this may be a bit easier to understand, this is often not the best way to do a hypothesis test. One simple thing that we can do to get

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Two sample T 2 test 1 Two sample T 2 test 2 Analogous to the univariate context, we

More information

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies The t-test: So Far: Sampling distribution benefit is that even if the original population is not normal, a sampling distribution based on this population will be normal (for sample size > 30). Benefit

More information

Chapter Seven: Multi-Sample Methods 1/52

Chapter Seven: Multi-Sample Methods 1/52 Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze

More information

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs) The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or

More information

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,

More information

You can compute the maximum likelihood estimate for the correlation

You can compute the maximum likelihood estimate for the correlation Stat 50 Solutions Comments on Assignment Spring 005. (a) _ 37.6 X = 6.5 5.8 97.84 Σ = 9.70 4.9 9.70 75.05 7.80 4.9 7.80 4.96 (b) 08.7 0 S = Σ = 03 9 6.58 03 305.6 30.89 6.58 30.89 5.5 (c) You can compute

More information

Lecture 4: Testing Stuff

Lecture 4: Testing Stuff Lecture 4: esting Stuff. esting Hypotheses usually has three steps a. First specify a Null Hypothesis, usually denoted, which describes a model of H 0 interest. Usually, we express H 0 as a restricted

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Jurusan Teknik Industri Universitas Brawijaya Outline Introduction The Analysis of Variance Models for the Data Post-ANOVA Comparison of Means Sample

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For

More information

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality

More information

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses. 1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in

More information

STAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern.

STAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern. STAT 01 Assignment NAME Spring 00 Reading Assignment: Written Assignment: Chapter, and Sections 6.1-6.3 in Johnson & Wichern. Due Monday, February 1, in class. You should be able to do the first four problems

More information

CHAPTER 9: HYPOTHESIS TESTING

CHAPTER 9: HYPOTHESIS TESTING CHAPTER 9: HYPOTHESIS TESTING THE SECOND LAST EXAMPLE CLEARLY ILLUSTRATES THAT THERE IS ONE IMPORTANT ISSUE WE NEED TO EXPLORE: IS THERE (IN OUR TWO SAMPLES) SUFFICIENT STATISTICAL EVIDENCE TO CONCLUDE

More information

Chapter 7: Hypothesis testing

Chapter 7: Hypothesis testing Chapter 7: Hypothesis testing Hypothesis testing is typically done based on the cumulative hazard function. Here we ll use the Nelson-Aalen estimate of the cumulative hazard. The survival function is used

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

Methods for Identifying Out-of-Trend Data in Analysis of Stability Measurements Part II: By-Time-Point and Multivariate Control Chart

Methods for Identifying Out-of-Trend Data in Analysis of Stability Measurements Part II: By-Time-Point and Multivariate Control Chart Peer-Reviewed Methods for Identifying Out-of-Trend Data in Analysis of Stability Measurements Part II: By-Time-Point and Multivariate Control Chart Máté Mihalovits and Sándor Kemény T his article is a

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?)

(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?) 12. Comparing Groups: Analysis of Variance (ANOVA) Methods Response y Explanatory x var s Method Categorical Categorical Contingency tables (Ch. 8) (chi-squared, etc.) Quantitative Quantitative Regression

More information

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015 AMS7: WEEK 7. CLASS 1 More on Hypothesis Testing Monday May 11th, 2015 Testing a Claim about a Standard Deviation or a Variance We want to test claims about or 2 Example: Newborn babies from mothers taking

More information

Within Cases. The Humble t-test

Within Cases. The Humble t-test Within Cases The Humble t-test 1 / 21 Overview The Issue Analysis Simulation Multivariate 2 / 21 Independent Observations Most statistical models assume independent observations. Sometimes the assumption

More information

Statistical Distribution Assumptions of General Linear Models
