Chapter 600

Hotelling's Two-Sample T²

Introduction

This module calculates power for the Hotelling's two-group, T-squared (T²) test statistic. Hotelling's T² is an extension of the univariate two-sample t-test to the case where the number of response variables is greater than one. These results may also be obtained using PASS's MANOVA test.

Assumptions

The following assumptions are made when using Hotelling's T² to analyze two groups of data.

1. The response variables are continuous.
2. The residuals follow the multivariate normal probability distribution with mean zero and constant variance-covariance matrix.
3. The subjects are independent.

Technical Details

The formulas used to perform a Hotelling's T² power analysis provide exact answers if the above assumptions are met. These formulas can be found in many places. We use the results in Rencher (1998). We refer you to that reference for more details.

Two-Group Technical Details

In the two-group case, sets of N1 observations from group 1 and N2 observations from group 2 are available on p response variables. We assume that all observations have the multivariate normal distribution with common variance-covariance matrix Σ. The mean vectors of the two groups are assumed to be µ1 and µ2 under the alternative hypothesis. Under the null hypothesis, these mean vectors are assumed to be equal.

The value of T² is computed using the formula

$$T^2 = \frac{N_1 N_2}{N_1 + N_2} (\bar{y}_1 - \bar{y}_2)' S_{pl}^{-1} (\bar{y}_1 - \bar{y}_2)$$

where ȳ1 and ȳ2 are the sample mean vectors of the two groups and S_pl is the pooled sample variance-covariance matrix.

To calculate power we need the non-centrality parameter for this distribution. This non-centrality parameter is defined as follows
$$\lambda = \frac{N_1 N_2}{N_1 + N_2} (\mu_1 - \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2) = \frac{N_1 N_2}{N_1 + N_2} \Delta^2$$

where

$$\Delta^2 = (\mu_1 - \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2)$$

We define Δ as effect size because it provides an expression for the magnitude of the standardized difference between the null and alternative means.

Using this non-centrality parameter, the power of the Hotelling's T² may be calculated for any value of the means and standard deviations. Since there is a simple relationship between the non-central T² and the non-central F, calculations are actually based on the non-central F using the formula

$$\beta = \Pr\left(F < F_{\alpha,\, df_1,\, df_2} \mid df_1,\, df_2,\, \lambda\right)$$

where

$$df_1 = p, \qquad df_2 = N_1 + N_2 - p - 1$$

Here F follows the non-central F distribution with df1 and df2 degrees of freedom and non-centrality parameter λ, F_{α,df1,df2} is the critical value of the central F distribution at significance level α, and power is 1 − β.

Procedure Options

This section describes the options that are specific to this procedure. These are located on the Design and Covariance tabs. For more information about the options of other tabs, go to the Procedure Window chapter.

Design Tab

The Design tab contains many of the options that you will be primarily concerned with.

Solve For

Solve For
This option specifies the parameter to be solved for. When you choose to solve for Sample Size, the program searches for the lowest sample size that meets the alpha and power criterion you have specified.

Power and Alpha

Power
This option specifies one or more values for power. Power is the probability of rejecting a false null hypothesis, and is equal to one minus Beta. Beta is the probability of a type-II error, which occurs when a false null hypothesis is not rejected. In this procedure, a type-II error occurs when you fail to reject the null hypothesis of equal means when in fact the means are different.

Values must be between zero and one. Historically, the value of 0.80 (Beta = 0.20) was used for power. Now, 0.90 (Beta = 0.10) is also commonly used.

A single value may be entered here or a range of values such as 0.8 to 0.95 by 0.05 may be entered.
Alpha
This option specifies one or more values for the probability of a type-I error. A type-I error occurs when a true null hypothesis is rejected. In this procedure, a type-I error occurs when you reject the null hypothesis of equal means when in fact the means are equal.

Values must be between zero and one. Historically, the value of 0.05 has been used for alpha. This means that about one test in twenty will falsely reject the null hypothesis. You should pick a value for alpha that represents the risk of a type-I error you are willing to take in your experimental situation.

You may enter a range of values such as 0.01 0.05 0.10 or 0.01 to 0.10 by 0.01.

Sample Size (When Solving for Sample Size)

Group Allocation
Select the option that describes the constraints on N1 or N2 or both. The options are

Equal (N1 = N2)
This selection is used when you wish to have equal sample sizes in each group. Since you are solving for both sample sizes at once, no additional sample size parameters need to be entered.

Enter N2, solve for N1
Select this option when you wish to fix N2 at some value (or values), and then solve only for N1. Please note that for some values of N2, there may not be a value of N1 that is large enough to obtain the desired power.

Enter R = N2/N1, solve for N1 and N2
For this choice, you set a value for the ratio of N2 to N1, and then PASS determines the needed N1 and N2, with this ratio, to obtain the desired power. An equivalent representation of the ratio, R, is N2 = R * N1.

Enter percentage in Group 1, solve for N1 and N2
For this choice, you set a value for the percentage of the total sample size that is in Group 1, and then PASS determines the needed N1 and N2 with this percentage to obtain the desired power.

N2 (Sample Size, Group 2)
This option is displayed if Group Allocation = "Enter N2, solve for N1".
N2 is the number of items or individuals sampled from the Group 2 population.
N2 must be ≥ 2. You can enter a single value or a series of values.
R (Group Sample Size Ratio)
This option is displayed only if Group Allocation = "Enter R = N2/N1, solve for N1 and N2".
R is the ratio of N2 to N1. That is, R = N2 / N1.
Use this value to fix the ratio of N2 to N1 while solving for N1 and N2. Only sample size combinations with this ratio are considered.
N2 is related to N1 by the formula: N2 = [R × N1], where the value [Y] is the next integer ≥ Y.
For example, setting R = 2.0 results in a Group 2 sample size that is double the sample size in Group 1 (e.g., N1 = 10 and N2 = 20, or N1 = 50 and N2 = 100).
R must be greater than 0. If R < 1, then N2 will be less than N1; if R > 1, then N2 will be greater than N1. You can enter a single value or a series of values.

Percent in Group 1
This option is displayed only if Group Allocation = "Enter percentage in Group 1, solve for N1 and N2".
Use this value to fix the percentage of the total sample size allocated to Group 1 while solving for N1 and N2. Only sample size combinations with this Group 1 percentage are considered. Small variations from the specified percentage may occur due to the discrete nature of sample sizes.
The Percent in Group 1 must be greater than 0 and less than 100. You can enter a single value or a series of values.

Sample Size (When Not Solving for Sample Size)

Group Allocation
Select the option that describes how individuals in the study will be allocated to Group 1 and to Group 2. The options are

Equal (N1 = N2)
This selection is used when you wish to have equal sample sizes in each group. A single per group sample size will be entered.

Enter N1 and N2 individually
This choice permits you to enter different values for N1 and N2.

Enter N1 and R, where N2 = R * N1
Choose this option to specify a value (or values) for N1, and obtain N2 as a ratio (multiple) of N1.

Enter total sample size and percentage in Group 1
Choose this option to specify a value (or values) for the total sample size (N), obtain N1 as a percentage of N, and then N2 as N - N1.

Sample Size Per Group
This option is displayed only if Group Allocation = "Equal (N1 = N2)".
The Sample Size Per Group is the number of items or individuals sampled from each of the Group 1 and Group 2 populations. Since the sample sizes are the same in each group, this value is the value for N1, and also the value for N2.
The Sample Size Per Group must be ≥ 2. You can enter a single value or a series of values.
N1 (Sample Size, Group 1)
This option is displayed if Group Allocation = "Enter N1 and N2 individually" or "Enter N1 and R, where N2 = R * N1".
N1 is the number of items or individuals sampled from the Group 1 population.
N1 must be ≥ 2. You can enter a single value or a series of values.

N2 (Sample Size, Group 2)
This option is displayed only if Group Allocation = "Enter N1 and N2 individually".
N2 is the number of items or individuals sampled from the Group 2 population.
N2 must be ≥ 2. You can enter a single value or a series of values.

R (Group Sample Size Ratio)
This option is displayed only if Group Allocation = "Enter N1 and R, where N2 = R * N1".
R is the ratio of N2 to N1. That is, R = N2/N1.
Use this value to obtain N2 as a multiple (or proportion) of N1.
N2 is calculated from N1 using the formula: N2 = [R × N1], where the value [Y] is the next integer ≥ Y.
For example, setting R = 2.0 results in a Group 2 sample size that is double the sample size in Group 1.
R must be greater than 0. If R < 1, then N2 will be less than N1; if R > 1, then N2 will be greater than N1. You can enter a single value or a series of values.

Total Sample Size (N)
This option is displayed only if Group Allocation = "Enter total sample size and percentage in Group 1".
This is the total sample size, or the sum of the two group sample sizes. This value, along with the percentage of the total sample size in Group 1, implicitly defines N1 and N2.
The total sample size must be greater than one, but practically, must be greater than 3, since each group sample size needs to be at least 2.
You can enter a single value or a series of values.

Percent in Group 1
This option is displayed only if Group Allocation = "Enter total sample size and percentage in Group 1".
This value fixes the percentage of the total sample size allocated to Group 1. Small variations from the specified percentage may occur due to the discrete nature of sample sizes.
The Percent in Group 1 must be greater than 0 and less than 100. You can enter a single value or a series of values.

Effect Size - Response Variables

Number of Response Variables
Enter the number of response (dependent or Y) variables. For a true multivariate test, this value will be greater than one.
The number of mean differences entered in the Mean Differences box or in the Mean Differences Column must equal this value. If you read in the covariance matrix from the spreadsheet, the number of columns specified must equal this value.
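The group allocation rules described in the Sample Size options above can be sketched in a few lines of Python. This is an illustration only, not PASS's code: the function names are hypothetical, and rounding N1 to the nearest integer in the percentage case is an assumption, since the text only says that small variations from the specified percentage may occur.

```python
import math

def n2_from_ratio(n1, r):
    """N2 = [R x N1], where [Y] is the next integer >= Y."""
    if r <= 0:
        raise ValueError("R must be greater than 0")
    # Round the product first so floating-point noise
    # (e.g. 0.1 * 30 = 3.0000000000000004) does not push the ceiling up by one.
    return math.ceil(round(r * n1, 9))

def split_total(n_total, pct_group1):
    """Return (N1, N2) with N1 taken as a percentage of N and N2 = N - N1.

    Assumption: N1 is rounded to the nearest integer, then kept in a range
    that leaves at least 2 subjects in each group.
    """
    if not 0 < pct_group1 < 100:
        raise ValueError("Percent in Group 1 must be between 0 and 100")
    if n_total < 4:
        raise ValueError("total sample size must allow at least 2 per group")
    n1 = round(n_total * pct_group1 / 100)
    n1 = min(max(n1, 2), n_total - 2)
    return n1, n_total - n1

print(n2_from_ratio(10, 2.0))   # ratio example from the text: N1 = 10, R = 2.0
print(split_total(100, 40))     # 40 percent of a total of 100 in Group 1
```

Because sample sizes are integers, the realized ratio or percentage can differ slightly from the requested one, which is the "discrete nature of sample sizes" noted in the option descriptions.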
Effect Size - Mean Differences

Mean Differences (= # of Response Vars)
Enter a list of values representing the mean differences under the alternative hypothesis. Under the null hypothesis, these values are all zero. The values entered here represent the differences that you want the experiment (study) to be able to detect.
Note that the number of values must match the number of Response Variables.
If you like, you can enter these values in a column on the spreadsheet. This column is specified using the Mean Differences Column option. When that option is specified, any values entered here are ignored.

Mean Differences Column
Use this option to specify the spreadsheet column containing the hypothesized mean differences. The response variables are represented down the rows. The number of rows with data must equal the number of response variables.
When this option is used, the 'Mean Differences' box is ignored.
You can obtain the spreadsheet by selecting Window, then Data, from the menus.

Effect Size - Mean Multiplier

K (Means Multipliers)
These values are multiplied times the mean differences to give you various effect sizes. A separate power calculation is generated for each value of K. If you want to ignore this setting, enter 1.

Covariance Tab

This tab specifies the covariance matrix.

Covariance Matrix Specification

Specify Which Covariance Matrix Input Method to Use
This option specifies which method will be used to define the covariance matrix.

Standard Deviation and Correlation
This option generates a covariance matrix based on the settings for the standard deviation (SD) and the pattern of correlations as specified in the Correlation Pattern and R options.

Covariance Matrix Variables
When this option is selected, the covariance matrix is read in from the columns of the spreadsheet. This is the most flexible method, but specifying a covariance matrix is tedious. You will usually only use this method when a specific covariance matrix is given to you.
Note that the spreadsheet is shown by selecting the menus: Window and then Data.

Covariance Matrix Specification - Input Method = Standard Deviation and Correlation

The parameters in this section provide a flexible way to specify Σ, the covariance matrix. Because the covariance matrix is symmetric, it can be represented as
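$$
\begin{aligned}
\Sigma &=
\begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{12} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{1p} & \sigma_{2p} & \cdots & \sigma_{pp}
\end{pmatrix} \\
&=
\begin{pmatrix}
\sigma_1^2 & \sigma_1 \sigma_2 \rho_{12} & \cdots & \sigma_1 \sigma_p \rho_{1p} \\
\sigma_1 \sigma_2 \rho_{12} & \sigma_2^2 & \cdots & \sigma_2 \sigma_p \rho_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_1 \sigma_p \rho_{1p} & \sigma_2 \sigma_p \rho_{2p} & \cdots & \sigma_p^2
\end{pmatrix} \\
&=
\begin{pmatrix}
\sigma_1 & 0 & \cdots & 0 \\
0 & \sigma_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_p
\end{pmatrix}
\begin{pmatrix}
1 & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{12} & 1 & \cdots & \rho_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{1p} & \rho_{2p} & \cdots & 1
\end{pmatrix}
\begin{pmatrix}
\sigma_1 & 0 & \cdots & 0 \\
0 & \sigma_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_p
\end{pmatrix}
\end{aligned}
$$

where p is the number of response variables.

Thus, the covariance matrix can be represented with complete generality by specifying the standard deviations σ1, σ2, ..., σp and the correlation matrix

$$
R =
\begin{pmatrix}
1 & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{12} & 1 & \cdots & \rho_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{1p} & \rho_{2p} & \cdots & 1
\end{pmatrix}
$$

SD (Common Standard Deviation)
This value is used to generate the covariance matrix. This option specifies a single standard deviation to be used for all response variables. The square of this value becomes the diagonal elements of the covariance matrix. Since this is a standard deviation, it must be greater than zero.
This option is only used when the first Covariance Matrix Input Method (Standard Deviation and Correlation) is selected.

R (Correlation)
Specify a correlation to be used in calculating the off-diagonal elements of the covariance matrix. Since this is a correlation, it must be between -1 and 1.
This option is only used when the first Covariance Matrix Input Method (Standard Deviation and Correlation) is selected.

Specify Correlation Pattern
This option specifies the pattern of the correlations in the variance-covariance matrix. Two options are available: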
Constant
The value of R is used as the constant correlation. For example, if R = 0.6 and p = 6, the correlation matrix would appear as

$$
R =
\begin{pmatrix}
1 & 0.600 & 0.600 & 0.600 & 0.600 & 0.600 \\
0.600 & 1 & 0.600 & 0.600 & 0.600 & 0.600 \\
0.600 & 0.600 & 1 & 0.600 & 0.600 & 0.600 \\
0.600 & 0.600 & 0.600 & 1 & 0.600 & 0.600 \\
0.600 & 0.600 & 0.600 & 0.600 & 1 & 0.600 \\
0.600 & 0.600 & 0.600 & 0.600 & 0.600 & 1
\end{pmatrix}
$$

1st-Order Autocorrelation
The value of R is used as the base autocorrelation in a first-order, serial correlation pattern. For example, if R = 0.6 and p = 6, the correlation matrix would appear as

$$
R =
\begin{pmatrix}
1 & 0.600 & 0.360 & 0.216 & 0.130 & 0.078 \\
0.600 & 1 & 0.600 & 0.360 & 0.216 & 0.130 \\
0.360 & 0.600 & 1 & 0.600 & 0.360 & 0.216 \\
0.216 & 0.360 & 0.600 & 1 & 0.600 & 0.360 \\
0.130 & 0.216 & 0.360 & 0.600 & 1 & 0.600 \\
0.078 & 0.130 & 0.216 & 0.360 & 0.600 & 1
\end{pmatrix}
$$

This pattern is often chosen as the most realistic when little is known about the correlation pattern and the response variables are measured across time.

Covariance Matrix Specification - Input Method = Covariance Matrix Variables

This option instructs the program to read the covariance matrix from the spreadsheet.

Spreadsheet Columns Containing the Covariance Matrix
This option designates the columns on the current spreadsheet holding the covariance matrix. It is used when the Specify Which Covariance Matrix Input Method to Use option is set to Covariance Matrix Variables.
The number of columns and number of rows must match the number of response variables at which the subjects are measured.
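The Standard Deviation and Correlation input method described above (a common SD, a single correlation R, and a Constant or 1st-Order Autocorrelation pattern) can be sketched as follows. The function names and the pattern labels "constant" and "ar1" are hypothetical choices for this illustration, not PASS identifiers.

```python
def correlation_matrix(rho, p, pattern="constant"):
    """Build a p x p correlation matrix from a single correlation value.

    pattern="constant": every off-diagonal entry equals rho.
    pattern="ar1":      entry (i, j) equals rho ** |i - j|
                        (first-order autocorrelation).
    """
    if not -1 <= rho <= 1:
        raise ValueError("a correlation must be between -1 and 1")
    if pattern == "constant":
        return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]
    if pattern == "ar1":
        return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]
    raise ValueError("pattern must be 'constant' or 'ar1'")

def covariance_matrix(sd, rho, p, pattern="constant"):
    """Sigma = D R D with D = sd * I, so each entry is sd^2 * correlation."""
    if sd <= 0:
        raise ValueError("a standard deviation must be greater than zero")
    corr = correlation_matrix(rho, p, pattern)
    return [[sd * sd * corr[i][j] for j in range(p)] for i in range(p)]

# First row of the 1st-Order Autocorrelation example with R = 0.6 and p = 6.
print([round(v, 3) for v in correlation_matrix(0.6, 6, "ar1")[0]])
```

With a common standard deviation, the diagonal of the covariance matrix is SD² and each off-diagonal element is SD² times the corresponding correlation.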
Example 1 - Power and Validation

Rencher (1998) pages 107-108 presents an example of power calculations for the two-group case in which the mean differences and covariance matrix are

$$
\mu_1 - \mu_2 = \begin{pmatrix} 3 \\ -2 \\ 3 \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} 6 & -3 & 3 \\ -3 & 5 & -6 \\ 3 & -6 & 9 \end{pmatrix}
$$

When N1 = N2 = 10, 12, 14, 16 and the significance level is 0.05, Rencher calculated the power to be 0.6438, 0.750, 0.839, 0.8936, respectively.

Setup

This section presents the values of each of the parameters needed to run this example. First, from the PASS Home window, load the Hotelling's Two-Sample T² procedure window by expanding Means, then clicking on Multivariate Means, and then clicking on Hotelling's Two-Sample T². You may then make the appropriate entries as listed below, or open Example 1 by going to the File menu and choosing Open Example Template. You can see that the values have been loaded into the spreadsheet by clicking on the spreadsheet button.

Option                              Value

Design Tab
Solve For ......................... Power
Alpha ............................. 0.05
Group Allocation .................. Equal (N1 = N2)
Sample Size Per Group ............. 10 12 14 16
Number of Response Variables ...... 3
Mean Differences .................. blank
Mean Differences Column ........... Differences
K (Means Multiplier) .............. 1.0

Covariance Tab
Specify Covariance Method ......... Covariance Matrix Columns
Spreadsheet Columns ............... VC_1-VC_3

Press the Spreadsheet button to enter the following values into the spreadsheet for columns VC_1 through VC_3
Row 1 .............................  6  -3   3
Row 2 ............................. -3   5  -6
Row 3 .............................  3  -6   9

Reports Tab
Show Numeric Results .............. Checked
Show Means Matrix ................. Checked
Show Covariance Matrix ............ Checked
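As a hand check on these inputs, the effect size and non-centrality parameter from the Technical Details can be worked out directly. This is a worked calculation from the example's mean differences (3, -2, 3) and the covariance matrix entered in VC_1 through VC_3, not output produced by PASS; the N1 = N2 = 10 case is used for λ and the degrees of freedom.

```latex
\Sigma^{-1} = \frac{1}{36}
\begin{pmatrix} 9 & 9 & 3 \\ 9 & 45 & 27 \\ 3 & 27 & 21 \end{pmatrix},
\qquad
\Sigma^{-1} (\mu_1 - \mu_2) =
\frac{1}{36} \begin{pmatrix} 27 - 18 + 9 \\ 27 - 90 + 81 \\ 9 - 54 + 63 \end{pmatrix}
= \begin{pmatrix} 0.5 \\ 0.5 \\ 0.5 \end{pmatrix}

\Delta^2 = (\mu_1 - \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2)
= (3)(0.5) + (-2)(0.5) + (3)(0.5) = 2,
\qquad \Delta = \sqrt{2} \approx 1.414

\lambda = \frac{N_1 N_2}{N_1 + N_2} \Delta^2 = \frac{10 \cdot 10}{20} \cdot 2 = 10,
\qquad df_1 = p = 3,
\qquad df_2 = N_1 + N_2 - p - 1 = 16
```

Because the groups are equal in size and Δ² = 2, λ equals the per-group sample size for every row of this example (10, 12, 14, and 16).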
Output

Click the Calculate button to perform the calculations and generate the following output.

Numeric Report

                            Multiply
                            Means             Effect   # of Y's
Power     N1    N2    N     By (K)    Alpha   Size     (DF1)      DF2
0.6443    10    10    20    1.0000    0.050   1.41     3          16
0.75459   12    12    24    1.0000    0.050   1.41     3          20
0.83613   14    14    28    1.0000    0.050   1.41     3          24
0.89360   16    16    32    1.0000    0.050   1.41     3          28

Note that the power values obtained here are very close to those obtained by Rencher. We feel that our results are more accurate since Rencher's results were obtained by interpolation from Tang's tables.

Means Section

Name    Mean
Y1       3.0000
Y2      -2.0000
Y3       3.0000

This report shows the mean differences that were read in.

Variance-Covariance Matrix Section

Response    Y1         Y2         Y3
Y1           2.4495    -0.5477     0.4082
Y2          -0.5477     2.2361    -0.8944
Y3           0.4082    -0.8944     3.0000
SD's on diagonal. Correlations off diagonal.

This report shows the variance-covariance matrix that was read in from the spreadsheet or generated by the settings on the Covariance tab. The standard deviations are given on the diagonal and the correlations are given off the diagonal.
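The Numeric Report values can be approximated outside PASS with a short sketch of the non-central F calculation from the Technical Details. The code below is an illustration under stated assumptions, not PASS's implementation: the function names are hypothetical, the incomplete beta function is evaluated with a standard continued fraction, the critical value is found by bisection, and the non-central F distribution is summed as a Poisson mixture (adequate for moderate λ only). Δ² = 2 is the value of (µ1 − µ2)'Σ⁻¹(µ1 − µ2) for the Example 1 inputs, consistent with the reported effect size of 1.41.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for coef in (even, odd):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def hotelling_power(n1, n2, delta_sq, p, alpha=0.05):
    """Power = 1 - Pr(F < F_crit | df1, df2, lambda) for the two-group T2."""
    df1, df2 = p, n1 + n2 - p - 1
    if df2 <= 0:
        raise ValueError("need N1 + N2 > p + 1")
    lam = n1 * n2 / (n1 + n2) * delta_sq
    a, b = df1 / 2.0, df2 / 2.0
    # With x = df1*F / (df2 + df1*F), the central F CDF is I_x(a, b).
    # Solve I_x(a, b) = 1 - alpha for the critical x by bisection.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if betainc(a, b, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    x_crit = 0.5 * (lo + hi)
    # Non-central F CDF as a Poisson(lambda/2) mixture of incomplete betas.
    beta, cum, weight, j = 0.0, 0.0, math.exp(-lam / 2.0), 0
    while j < 10000:
        beta += weight * betainc(a + j, b, x_crit)
        cum += weight
        if 1.0 - cum < 1e-12:
            break
        j += 1
        weight *= (lam / 2.0) / j
    return 1.0 - beta

for n in (10, 12, 14, 16):
    print(n, round(hotelling_power(n, n, 2.0, 3), 5))
```

If the sketch is right, the printed values should land close to the Power column of the Numeric Report. Where a numerical library is available, the same quantities could instead come from library routines (for instance SciPy's `scipy.stats.f` and `scipy.stats.ncf` distributions); the hand-rolled version is used here only to keep the sketch free of dependencies.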
Chart Section

This chart shows the relationship between power and N1.