Profile Analysis and Multivariate Regression


1 Lecture 8: Profile Analysis and Multivariate Regression. October 12, 2005. Multivariate Analysis. Lecture #8-10/12/2005 Slide 1 of 68

2 Today's Lecture: Profile analysis; multivariate regression (review of multiple regression). The current assignment is due Thursday, October 27th, at 5pm in my mailbox. Lecture #8-10/12/2005 Slide 2 of 68

3 Schedule: 10/19 - Case studies in statistics; 10/26 - Principal components; 10/27 - Assignment due; 11/2 - Analysis of covariance structures; 11/9 - Canonical correlation; 11/16 - Discrimination/classification; 11/23 - Thanksgiving break; 11/30 - Clustering/distance methods/MDS (final handed out, due 12/15); 12/7 - No class. Lecture #8-10/12/2005 Slide 3 of 68

4 Profile Analysis: We begin with a definition of profile analysis (a.k.a. growth curves or repeated measures). For each subject we have a vector of observations that are commensurate: we assume that the observations are measured in the same units, and we assume that the observations have about the same variances. Typically, these analyses are done with data that have been collected over time, and we are interested in seeing whether there is a general linear trend (an ordering is not necessary, though). Lecture #8-10/12/2005 Slide 4 of 68

5 Single Sample: We begin our discussion of profile analysis with a reminder that we performed such an analysis with a single sample: the repeated measures T^2 (example using the dogs data). In this case we have a set of p observations with mean vector \mu = (\mu_1, \mu_2, \ldots, \mu_p)'. Now we want to test the null hypothesis H_0: \mu_1 = \mu_2 = \cdots = \mu_p. If all you knew was ANOVA, this particular hypothesis would be analyzed using an ANOVA, BUT that would assume that the observations are independent across the p variables. We know that is not true. So how are we going to do this? Lecture #8-10/12/2005 Slide 5 of 68

6 Single Sample: Recall from the dogs data that we can rewrite the hypothesis that all means are equal as H_0: \mu_1 - \mu_2 = \mu_2 - \mu_3 = \cdots = \mu_{p-1} - \mu_p = 0, or, in vector form, H_0: (\mu_1 - \mu_2, \mu_2 - \mu_3, \ldots, \mu_{p-1} - \mu_p)' = \mathbf{0}. Notice that the only way this can occur is when all means are equal. Lecture #8-10/12/2005 Slide 6 of 68

7 Single Sample: We begin by defining z = Cx, where C is a ((p-1) \times p) contrast matrix of successive differences, so that Cy = (y_1 - y_2, y_2 - y_3, \ldots, y_{p-1} - y_p)'. Then we compute our new mean vector \bar{z} = C\bar{x} and covariance matrix S_z = C S_x C'. There are now p-1 rows in z, and S_z is of size (p-1) \times (p-1). Lecture #8-10/12/2005 Slide 7 of 68

8 Single Sample: From there we compute Hotelling's T^2 using our old equation and our new variables z: T^2 = n(\bar{z} - \mu_0)' S_z^{-1} (\bar{z} - \mu_0), which is the same as T^2 = n(C\bar{x} - \mu_0)' (C S_x C')^{-1} (C\bar{x} - \mu_0). T^2 in this case has p-1 variables and n-1 degrees of freedom. Lecture #8-10/12/2005 Slide 8 of 68
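
To make the last two slides concrete, here is a minimal SAS/IML sketch of the single-sample (flat-profile) test. This is not code from the lecture; the data matrix, the 95% level, and the choice of \mu_0 = 0 are assumptions made purely for illustration.

proc iml;
  /* hypothetical data: n = 5 subjects measured on p = 3 commensurate variables */
  x = {3 4 6, 2 5 5, 4 4 7, 3 6 6, 5 5 8};
  n = nrow(x);  p = ncol(x);
  /* (p-1) x p contrast matrix of successive differences */
  C = {1 -1  0,
       0  1 -1};
  xbar = t(x[:,]);                    /* p x 1 sample mean vector          */
  xc   = x - repeat(x[:,], n, 1);     /* centered data                     */
  Sx   = xc` * xc / (n - 1);          /* p x p sample covariance matrix    */
  zbar = C * xbar;                    /* mean vector of the differences    */
  Sz   = C * Sx * C`;                 /* covariance matrix of differences  */
  T2   = n * zbar` * inv(Sz) * zbar;  /* Hotelling's T^2 for H0: C*mu = 0  */
  /* convert to an F statistic with (p-1, n-p+1) degrees of freedom */
  Fstat = (n - p + 1) / ((n - 1) * (p - 1)) * T2;
  Fcrit = finv(0.95, p - 1, n - p + 1);
  print T2 Fstat Fcrit;
quit;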

9 Two Samples: Now imagine that instead of having only a single mean vector that we are interested in testing, we have two. Previously we only used a test to see if we believed that the means were all equal; now, with two vectors, we can test three things. 1. Are the two profiles parallel? That is, is H_01: \mu_{1i} - \mu_{1(i-1)} = \mu_{2i} - \mu_{2(i-1)} (i = 2, 3, \ldots, p)? 2. If they are parallel, are they coincident? That is, is H_02: \mu_{1i} = \mu_{2i} (i = 1, 2, 3, \ldots, p)? 3. If they are coincident, are they level? That is, is H_03: \mu_{11} = \mu_{12} = \cdots = \mu_{1p} = \mu_{21} = \mu_{22} = \cdots = \mu_{2p}? Lecture #8-10/12/2005 Slide 9 of 68

10 Example Data Set: To demonstrate profile analysis, we will use the example data set from Johnson and Wichern (p. 320): As part of a larger study of love and marriage, E. Hatfield, a sociologist, surveyed adults with respect to their marriage contributions and outcomes and their levels of passionate and companionate love. Recently married males and females were asked to respond to the following questions using an 8-point scale (8 representing extremely positive): How would you describe your contributions to the marriage? How would you describe your outcomes from the marriage? Lecture #8-10/12/2005 Slide 10 of 68

11 Example Data Set: Continuing on, subjects were also asked to respond to the following questions, using the 5-point scale shown: What is the level of passionate love that you feel for your partner? What is the level of companionate love that you feel for your partner? Lecture #8-10/12/2005 Slide 11 of 68

12 Example Data Set (shown as an image on the slide; not transcribed). Lecture #8-10/12/2005 Slide 12 of 68

13 Example Data Set (continued; shown as an image on the slide; not transcribed). Lecture #8-10/12/2005 Slide 13 of 68

14 Part 1: Parallel Profiles? We begin by asking whether the two population mean vectors are parallel. If they are parallel, it could mean that one group performed uniformly better than the other group, or that the two groups are equal. To test this we need to think about what it would mean for the profiles to be parallel: it would mean that the difference between any two means in group one is the same as the difference between the same two means in group two. Lecture #8-10/12/2005 Slide 14 of 68

15 Parallel Profiles? We have already done something like this in the single-profile case. In the single case we said that all differences should equal zero (this amounted to a single mean vector test). Now we have mean differences for each group, and instead of testing whether they equal zero we will test whether they are EQUAL. The null hypothesis is H_0: (\mu_{12} - \mu_{11}, \mu_{13} - \mu_{12}, \ldots, \mu_{1p} - \mu_{1(p-1)})' = (\mu_{22} - \mu_{21}, \mu_{23} - \mu_{22}, \ldots, \mu_{2p} - \mu_{2(p-1)})', i.e., C\mu_1 = C\mu_2. Lecture #8-10/12/2005 Slide 15 of 68

16 Parallel Profiles? We compute our regular T^2 using the contrast matrix C, with p-1 contrasts: T^2 = (\bar{x}_1 - \bar{x}_2)' C' [ (1/n_1 + 1/n_2) C S_p C' ]^{-1} C (\bar{x}_1 - \bar{x}_2), which is compared against [(n_1 + n_2 - 2)(p - 1)/(n_1 + n_2 - p)] F_{p-1, n_1+n_2-p}(\alpha). Lecture #8-10/12/2005 Slide 16 of 68

17 Coincident Profiles? Now, especially if we found that the profiles are parallel, we are going to test whether they are, ON AVERAGE, the same height. Recall our null hypothesis: H_02: \mu_{1i} = \mu_{2i} (i = 1, 2, 3, \ldots, p). We could also rewrite this as H_0: (\mu_{11} + \mu_{12} + \cdots + \mu_{1p}) = (\mu_{21} + \mu_{22} + \cdots + \mu_{2p}), i.e., \mathbf{1}'\mu_1 = \mathbf{1}'\mu_2. Lecture #8-10/12/2005 Slide 17 of 68

18 Coincident Profiles? To test this hypothesis, we again use T^2: T^2 = [\mathbf{1}'(\bar{x}_1 - \bar{x}_2)]' [ (1/n_1 + 1/n_2) \mathbf{1}' S_p \mathbf{1} ]^{-1} [\mathbf{1}'(\bar{x}_1 - \bar{x}_2)]. This statistic is essentially the squared univariate t-test comparing the overall average across all variables for each group. T^2 is compared with t^2_{n_1+n_2-2}(\alpha/2) = F_{1, n_1+n_2-2}(\alpha). Lecture #8-10/12/2005 Slide 18 of 68

19 Level Profiles? Finally, we can test to see whether the two profiles are, on average, flat (i.e., this is the same as looking for a main effect across variables, ON AVERAGE). What is nice about this is that we have already learned how to test whether a single profile is flat: H_0: \mu_1 = \mu_2 = \cdots = \mu_p. Now we want to phrase this same test so that we will test H_0: (1/2)(\mu_{11} + \mu_{21}) = (1/2)(\mu_{12} + \mu_{22}) = \cdots = (1/2)(\mu_{1p} + \mu_{2p}). We could rewrite this as another hypothesis test: H_0: C [ (1/2)(\mu_1 + \mu_2) ] = \mathbf{0}. Lecture #8-10/12/2005 Slide 19 of 68

20 Level Profiles? So to test this we will use the overall mean and test whether it is flat (this is conceptually the same as pooling our two groups and performing a one-vector test). This means we define the overall mean \bar{x} as \bar{x} = (n_1 \bar{x}_1 + n_2 \bar{x}_2)/(n_1 + n_2). Our estimate of variance will be the pooled variance-covariance matrix S_p. We perform our regular test to see if the profile is flat: T^2 = (n_1 + n_2) (C\bar{x})' (C S_p C')^{-1} (C\bar{x}). This is compared against [(n_1 + n_2 - 1)(p - 1)/(n_1 + n_2 - p + 1)] F_{p-1, n_1+n_2-p+1}(\alpha). Lecture #8-10/12/2005 Slide 20 of 68
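
The three two-sample tests above can be computed directly from the formulas. The following SAS/IML sketch is illustrative only: the two data matrices are made up, and the critical values follow the scaled F reference distributions given on the preceding slides.

proc iml;
  /* hypothetical data: two groups measured on p = 3 commensurate variables */
  x1 = {5 6 7, 4 5 7, 6 6 8, 5 7 7};
  x2 = {3 4 5, 4 4 6, 2 5 5, 3 3 6, 4 5 5};
  n1 = nrow(x1);  n2 = nrow(x2);  p = ncol(x1);
  C   = {1 -1 0, 0 1 -1};               /* (p-1) x p successive-difference contrasts */
  one = j(p, 1, 1);                     /* p x 1 vector of ones */

  xbar1 = t(x1[:,]);  xbar2 = t(x2[:,]);
  c1 = x1 - repeat(x1[:,], n1, 1);  S1 = c1` * c1 / (n1 - 1);
  c2 = x2 - repeat(x2[:,], n2, 1);  S2 = c2` * c2 / (n2 - 1);
  Sp = ((n1 - 1) # S1 + (n2 - 1) # S2) / (n1 + n2 - 2);   /* pooled covariance */
  d  = xbar1 - xbar2;
  k  = 1/n1 + 1/n2;
  a  = 0.05;

  /* 1. parallel profiles: H01: C*mu1 = C*mu2 */
  Cd = C * d;
  T2_par   = Cd` * inv(k # (C * Sp * C`)) * Cd;
  crit_par = (n1 + n2 - 2) # (p - 1) / (n1 + n2 - p) # finv(1 - a, p - 1, n1 + n2 - p);

  /* 2. coincident profiles: H02: 1'mu1 = 1'mu2 */
  num = one` * d;
  T2_coin   = num # num / (k # (one` * Sp * one));
  crit_coin = finv(1 - a, 1, n1 + n2 - 2);

  /* 3. level profiles: H03: contrasts of the pooled mean profile are zero */
  xbar = (n1 # xbar1 + n2 # xbar2) / (n1 + n2);
  Cx = C * xbar;
  T2_lev   = (n1 + n2) # (Cx` * inv(C * Sp * C`) * Cx);
  crit_lev = (n1 + n2 - 1) # (p - 1) / (n1 + n2 - p + 1) # finv(1 - a, p - 1, n1 + n2 - p + 1);

  print T2_par crit_par, T2_coin crit_coin, T2_lev crit_lev;
quit;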

21 Multiple Regression: Recall the multiple regression equation (for the jth observation, prediction of Y_j by r variables z_{j1}, \ldots, z_{jr}): Y_j = \beta_0 + \beta_1 z_{j1} + \beta_2 z_{j2} + \cdots + \beta_r z_{jr} + \epsilon_j. In univariate multiple regression, we assume the following: all n observations are independent; E(\epsilon_j) = 0 - the error terms have a zero mean; Var(\epsilon_j) = \sigma^2 - the error terms have a constant variance; Cov(\epsilon_j, \epsilon_k) = 0 - the error terms are uncorrelated between observations. Lecture #8-10/12/2005 Slide 21 of 68

22 Regression Analysis with Matrices: The equation above can be expressed more compactly using matrices: y = Z\beta + \epsilon, where y is of size (n \times 1), Z is of size (n \times (r + 1)), \beta is of size ((r + 1) \times 1), and \epsilon is of size (n \times 1). Lecture #8-10/12/2005 Slide 22 of 68

23 Regression with Matrices: Written out, y = Z\beta + \epsilon says that the (n \times 1) response vector y = (Y_1, Y_2, \ldots, Y_n)' equals the (n \times (r + 1)) design matrix Z, whose jth row is (1, z_{j1}, z_{j2}, \ldots, z_{jr}), times the ((r + 1) \times 1) coefficient vector \beta = (\beta_0, \beta_1, \ldots, \beta_r)', plus the (n \times 1) error vector \epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)'. Lecture #8-10/12/2005 Slide 23 of 68

24 Regression with Matrices: Working the matrix multiplication and addition for a single case gives Y_1 = \beta_0 + \beta_1 z_{11} + \beta_2 z_{12} + \cdots + \beta_r z_{1r} + \epsilon_1. Lecture #8-10/12/2005 Slide 24 of 68

31 Regression with Matrices: The matrix of predictors, Z, has a first column containing all ones; this represents the intercept parameter \beta_0. This is also an introduction to setting columns of the Z matrix to represent design and/or group controls (as in ANOVA). Also note that the assumptions from the univariate regression model can be re-expressed: E(\epsilon) = 0 - the error terms have a zero mean vector; Cov(\epsilon) = E(\epsilon\epsilon') = \sigma^2 I - the error terms have a constant variance (all equal to \sigma^2) and are uncorrelated between observations. Lecture #8-10/12/2005 Slide 25 of 68
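
As a small aside on the design-matrix idea, here is a sketch (SAS/IML, with made-up values) of a Z whose columns are an intercept, a continuous covariate, and a dummy-coded group indicator. Dummy coding of this sort is one way to represent group membership in Z; proc glm's internal coding of class effects differs in its details.

proc iml;
  /* hypothetical predictors for n = 6 observations */
  cont  = {2.1, 3.4, 1.9, 4.0, 3.2, 2.8};   /* a continuous covariate                  */
  group = {0, 0, 0, 1, 1, 1};               /* dummy code: 0 = control, 1 = treatment  */
  n = nrow(cont);
  Z = j(n, 1, 1) || cont || group;          /* intercept, covariate, and group columns */
  print Z;
quit;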

32 Distribution of Errors: We often place distributional assumptions on our error terms, allowing for the development of tractable hypothesis tests. With matrices, the distributional assumptions are no different, except that things are approached in a multivariate fashion: \epsilon \sim N_n(0, \sigma^2 I). Having a multivariate normal distribution with uncorrelated variables (from the identity matrix I) is identical to saying that for each observation j, \epsilon_j \sim N(0, \sigma^2). Lecture #8-10/12/2005 Slide 26 of 68

33 Estimation with Matrices: Regression estimates are typically found via least squares (called L_2 estimates). In least squares regression, the estimates are found by minimizing \sum_{j=1}^{n} \epsilon_j^2 = \sum_{j=1}^{n} (Y_j - \hat{Y}_j)^2 = \sum_{j=1}^{n} (Y_j - [\beta_0 + \beta_1 z_{j1} + \cdots + \beta_r z_{jr}])^2. As you could guess, we can accomplish all of this via matrices; equivalently, \sum_{j=1}^{n} \epsilon_j^2 = \sum_{j=1}^{n} (Y_j - z_j'\beta)^2 = (y - Z\beta)'(y - Z\beta) = \epsilon'\epsilon. Lecture #8-10/12/2005 Slide 27 of 68

34 Estimation with Matrices: Thankfully, there are people who figured out the equation for b that minimizes \epsilon'\epsilon: b = (Z'Z)^{-1} Z'y, with dimensions ((r+1) \times 1) = [((r+1) \times n)(n \times (r+1))]^{-1} ((r+1) \times n)(n \times 1). This equation is what I have been talking about for quite some time: the General Linear Model. For many types of data, in many differing analyses, this equation will provide estimates: Multiple Regression, ANOVA, Analysis of Covariance (ANCOVA), and Multiple Regression with curvilinear relationships in Z. Lecture #8-10/12/2005 Slide 28 of 68
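
For completeness, here is a small SAS/IML sketch of the least squares formula itself. The numbers are invented stand-ins for the cake data (which appears only as an image below); in practice you would let proc glm or proc reg do this, and numerically a call such as solve(Z`*Z, Z`*y) is preferable to an explicit inverse.

proc iml;
  /* hypothetical cake-style data: rating, moisture, sweetness for n = 6 cakes */
  y  = {64, 73, 61, 76, 72, 80};
  z1 = {4, 6, 5, 7, 6, 8};            /* moisture  */
  z2 = {2, 4, 2, 5, 3, 5};            /* sweetness */
  n = nrow(y);
  Z = j(n, 1, 1) || z1 || z2;         /* design matrix with an intercept column   */
  b = inv(Z` * Z) * Z` * y;           /* least squares estimates (Z'Z)^{-1} Z'y   */
  yhat = Z * b;                       /* fitted values */
  e    = y - yhat;                    /* residuals     */
  print b;
  print yhat e;
quit;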

35 Example: Imagine that you are trying to determine the best type of birthday cake to get for someone whose birthday is fast approaching. Given your knowledge of the subject, you have identified two variables that seem to play a primary role in the ratings people assign to cakes: moisture content and sweetness. You are interested in determining how moisture content and sweetness play a role in cake preference ratings. You collect a random sample of cakes from locations here in Lawrence (like Target), and you collect a random sample of cake eaters here in Lawrence. Lecture #8-10/12/2005 Slide 29 of 68

36 The Cook: Chef Bob (photo shown on slide). Lecture #8-10/12/2005 Slide 30 of 68

37 The Cake (photo shown on slide). Lecture #8-10/12/2005 Slide 31 of 68

38 The Data: a table of cake ratings with columns Rating (Y), Moisture (z1), and Sweetness (z2); the numeric values are shown on the slide and are not transcribed here. Lecture #8-10/12/2005 Slide 32 of 68

39 The Code: proc glm data=cake; model rating=moisture sweetness; run; Lecture #8-10/12/2005 Slide 33 of 68

40 The Output (SAS output shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 34 of 68

41 Regression Sums of Squares: The sum of squares for the regression can be obtained by matrix calculations: SS_reg = b'Z'y - (\sum_{j=1}^{n} Y_j)^2 / n = b'Z'y - (1/n)(y'\mathbf{1})^2. Lecture #8-10/12/2005 Slide 35 of 68

42 Residual Sums of Squares: Hypothesis tests in multiple regression proceed similarly to those in ANOVA, in that a single omnibus test is performed prior to inspecting each of the model parameters for significance. The hypothesis test is constructed by comparing the sum of squares due to regression (divided by the number of variables in the model) to the sum of squares due to error (divided by the number of observations minus the number of variables minus one). Lecture #8-10/12/2005 Slide 36 of 68

43 Residual Sums of Squares: The residual sum of squares is also obtained by matrix calculations: SS_res = \sum_{j=1}^{n} e_j^2 = e'e = y'y - b'Z'y. Lecture #8-10/12/2005 Slide 37 of 68

44 Omnibus Hypothesis Test: The overall hypothesis test, testing H_0: \beta = 0, is given by F = [SS_reg / r] / [SS_res / (n - r - 1)]. This is compared with F_{r, n-r-1}(\alpha). If this is rejected, inspection of the model parameters ensues, testing each for being equal to zero. Lecture #8-10/12/2005 Slide 38 of 68

45 Variance of Estimators: The covariance matrix of the regression parameters contains useful information regarding the standard errors of the estimates (the square roots of its diagonal elements). To find the covariance matrix of the estimates, the residual sum of squares is needed (dividing this term by the residual degrees of freedom gives the error variance, \sigma_e^2): Cov(b) = \sigma_e^2 (Z'Z)^{-1} = [SS_res / (n - r - 1)] (Z'Z)^{-1} = [1/(n - r - 1)] e'e (Z'Z)^{-1}. Lecture #8-10/12/2005 Slide 39 of 68

46 Squared Correlation: The squared multiple correlation coefficient, or R^2, is found from R^2 = SS_reg / \sum_{j=1}^{n} (Y_j - \bar{Y})^2. This is also obtainable by matrix calculations: R^2 = [b'Z'y - (1/n)(y'\mathbf{1})^2] / [y'y - (1/n)(y'\mathbf{1})^2]. Lecture #8-10/12/2005 Slide 40 of 68
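
Continuing the same hypothetical numbers used in the earlier least squares sketch, the following SAS/IML block computes the quantities defined on the last several slides (SS_reg, SS_res, the omnibus F, Cov(b), and R^2). It is illustrative only; the same numbers appear in standard proc reg / proc glm output.

proc iml;
  /* hypothetical cake-style data: rating (y), moisture (z1), sweetness (z2) */
  y  = {64, 73, 61, 76, 72, 80};
  z1 = {4, 6, 5, 7, 6, 8};
  z2 = {2, 4, 2, 5, 3, 5};
  n = nrow(y);  r = 2;                     /* number of predictors */
  Z = j(n, 1, 1) || z1 || z2;
  b = inv(Z` * Z) * Z` * y;
  e = y - Z * b;

  sumy  = y[+];                            /* sum of the responses               */
  cterm = sumy # sumy / n;                 /* correction term (sum of y)^2 / n   */
  SSreg = b` * Z` * y - cterm;             /* regression sum of squares          */
  SSres = e` * e;                          /* residual sum of squares            */
  Fstat = (SSreg / r) / (SSres / (n - r - 1));   /* omnibus F test of H0: beta = 0 */
  pval  = 1 - probf(Fstat, r, n - r - 1);
  s2e   = SSres / (n - r - 1);             /* error variance estimate            */
  covb  = s2e # inv(Z` * Z);               /* covariance matrix of b             */
  seb   = sqrt(vecdiag(covb));             /* standard errors of the estimates   */
  R2    = SSreg / (y` * y - cterm);        /* squared multiple correlation       */
  print SSreg SSres Fstat pval R2, b seb;
quit;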

47 Prediction: To make things easy, let's begin by trying to predict a cake's rating based on a single z variable: moisture content. From the results of the regression analysis, for any value of moisture content z, we predict the rating to be \hat{Y} = b_0 + b_1 z (using the estimates from the output). But knowing the predicted value is just the beginning. Lecture #8-10/12/2005 Slide 41 of 68

48 What Does the Predicted Value Represent? The predicted value given by a regression equation is the expected value of Y conditional on z. This means that for every value of z, the mean value E(Y | z) is given by \hat{Y}. Recall from basic statistics that if a variable x is distributed with mean \mu_x and variance \sigma_x^2, then as the sample size n goes to infinity, the distribution of the sample mean \bar{x} approaches N(\mu_x, \sigma_x^2 / n) (Central Limit Theorem). Using the CLT, we are able to build a confidence interval around our sample mean so that we can be 100(1 - \alpha)% confident that the true \mu_x lies in the interval. The same principle can be applied to \hat{Y}, which is the mean of the distribution of Y | z. Lecture #8-10/12/2005 Slide 42 of 68

49 Confidence Interval for Mean Predicted Value: A confidence interval for the mean predicted value of a regression is given by z_0'b \pm t_{n-r-1}(\alpha/2) \sqrt{ z_0'(Z'Z)^{-1} z_0 \, s^2 }, where z_0 is the set of values of z used to predict Y. As the value of z_0 gets farther from its mean, the interval increases in range. You can get mean prediction values from SAS with the following code: *SAS Example #2 - mean prediction; proc glm data=cake; model rating=moisture / p clm; run; Lecture #8-10/12/2005 Slide 43 of 68

50 Predicted Value Example (SAS output shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 44 of 68

51 Graph of the Mean Predicted Value Interval (plot shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 45 of 68

52 CI for Single Predicted Values: Oftentimes, the mean value is not what is of interest in prediction. Imagine you buy a cake with a moisture rating of 6. You want to know how well that cake will be liked; furthermore, you want to know what range of ratings you should expect from this cake. Clearly, in your case, the CI for the mean isn't very helpful. The CI for a single predicted value would be z_0'b \pm t_{n-r-1}(\alpha/2) \sqrt{ (1 + z_0'(Z'Z)^{-1} z_0) s^2 }. Lecture #8-10/12/2005 Slide 46 of 68

53 SE of a Single Predicted Value: This formula is very similar to that for the mean. Again, because of the (z_0 - \bar{z})^2 term, as z_0 gets farther from its mean, the CI gets wider. For the cake with moisture 6, \hat{Y} = b_0 + b_1(6), using the estimates from the output (numeric values shown on the slide). In SAS: *SAS Example #3 - individual prediction; proc glm data=cake; model rating=moisture / p cli; run; Lecture #8-10/12/2005 Slide 47 of 68
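
Here is an illustrative SAS/IML version of the two intervals (mean prediction and single prediction) at a z_0 corresponding to moisture = 6. The data values are made up; the clm and cli options shown above produce the analogous intervals from proc glm.

proc iml;
  /* hypothetical single-predictor cake data: rating (y) and moisture (z1) */
  y  = {64, 73, 61, 76, 72, 80};
  z1 = {4, 6, 5, 7, 6, 8};
  n = nrow(y);  r = 1;
  Z = j(n, 1, 1) || z1;
  b = inv(Z` * Z) * Z` * y;
  e = y - Z * b;
  s2 = e` * e / (n - r - 1);             /* error variance estimate */

  z0 = {1, 6};                           /* predict at moisture = 6 (leading 1 for the intercept) */
  yhat0 = z0` * b;
  tcrit = tinv(0.975, n - r - 1);        /* t quantile for a 95% interval       */
  h = z0` * inv(Z` * Z) * z0;            /* leverage-type term z0'(Z'Z)^{-1}z0  */

  se_mean = sqrt(h # s2);                /* SE of the mean prediction           */
  se_ind  = sqrt((1 + h) # s2);          /* SE for a single new observation     */
  ci_mean = yhat0 + {-1 1} # (tcrit # se_mean);
  ci_ind  = yhat0 + {-1 1} # (tcrit # se_ind);
  print yhat0, ci_mean, ci_ind;
quit;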

54 Predicted Value Example (SAS output shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 48 of 68

55 Graph of the Mean Predicted Value Interval (plot shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 49 of 68

56 Graph of Both Intervals (plot shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 50 of 68

57 Multivariate Regression: Everything about univariate regression can be generalized to multivariate regression through the use of matrices. Instead of predicting a single column vector y, we now predict a matrix of observations Y: Y = Z\beta + \epsilon, where Y is of size (n \times m), Z is of size (n \times (r + 1)), \beta is of size ((r + 1) \times m), and \epsilon is of size (n \times m). Lecture #8-10/12/2005 Slide 51 of 68

58 Multivariate Regression with Matrices: Y is the (n \times m) matrix with element Y_{ji} in row j (observation) and column i (outcome variable), and Z is the (n \times (r + 1)) design matrix whose jth row is (1, z_{j1}, z_{j2}, \ldots, z_{jr}). Lecture #8-10/12/2005 Slide 52 of 68

59 Multivariate Regression with Matrices: \beta is the ((r + 1) \times m) matrix of coefficients, with one column (\beta_{0i}, \beta_{1i}, \ldots, \beta_{ri})' per outcome variable, and \epsilon is the (n \times m) matrix of errors \epsilon_{ji}. Lecture #8-10/12/2005 Slide 53 of 68

60 Estimation with Matrices: Just as with univariate regression, the least squares estimates are found by B = (Z'Z)^{-1} Z'Y, with dimensions ((r + 1) \times m) = [((r + 1) \times n)(n \times (r + 1))]^{-1} ((r + 1) \times n)(n \times m). Every other formula used for linear regression can be extended in a similar fashion. Lecture #8-10/12/2005 Slide 54 of 68
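
A minimal SAS/IML sketch of multivariate least squares, with invented numbers for two outcomes: note that B simply collects, column by column, the univariate least squares estimates for each outcome.

proc iml;
  /* hypothetical data: two outcomes (columns of Y) and two predictors, n = 6 */
  Y  = {64 31, 73 38, 61 30, 76 40, 72 36, 80 43};
  z1 = {4, 6, 5, 7, 6, 8};
  z2 = {2, 4, 2, 5, 3, 5};
  n = nrow(Y);  m = ncol(Y);  r = 2;
  Z = j(n, 1, 1) || z1 || z2;
  B = inv(Z` * Z) * Z` * Y;          /* (r+1) x m: one column of estimates per outcome */
  E = Y - Z * B;                     /* n x m matrix of residuals                      */
  SigmaHat = E` * E / n;             /* m x m estimate of the error covariance matrix  */
  print B, SigmaHat;
quit;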

61 Hypothesis Tests: Like MANOVA, the multivariate regression model can be tested for an overall indication that some regression coefficient is equal to zero (one out of the many within the \beta matrix). Again, the hypothesis test information can be found within the MANOVA output from SAS. To introduce this, take SAS example #4, from problem 7.25 of Johnson and Wichern (p. 423): Amitriptyline is prescribed by some physicians as an antidepressant. However, there are also conjectured side effects that seem to be related to the use of the drug: irregular heartbeat, abnormal blood pressure, and irregular waves on the electrocardiogram, among other things. Data are gathered on 17 patients who were admitted to the hospital after an amitriptyline overdose. Lecture #8-10/12/2005 Slide 55 of 68

62 Hypothesis Tests: Variables: Y_1: TOT - total TCAD plasma level; Y_2: AMI - amount of amitriptyline present in the TCAD plasma level; Z_1: GEN - gender (1 if female, 0 if male); Z_2: AMT - amount of antidepressants taken at time of overdose; Z_3: PR - PR wave measurement; Z_4: DIAP - diastolic blood pressure; Z_5: QRS - QRS wave measurement. Lecture #8-10/12/2005 Slide 56 of 68

63 Hypothesis Tests: The SAS code for multivariate regression is: data battery; infile 'C:\T7-6.dat'; input tot ami gen amt pr diap qrs; run; proc glm data=battery; model tot ami = gen amt pr diap qrs; manova h = gen amt pr diap qrs / printe; run; Lecture #8-10/12/2005 Slide 57 of 68

64 SAS Output (test of H_0: \beta_5 = 0; output shown on slide, not transcribed). Lecture #8-10/12/2005 Slide 58 of 68

65 SAS Output (continued; shown on slide, not transcribed). Lecture #8-10/12/2005 Slide 59 of 68

66 Predictions: Just as with univariate regression, two types of predictions can be made for multivariate regression: predictions for the mean response vector and predictions for a single response. Now, these predictions are in the form of a (you guessed it) confidence ellipse. These ellipses can then be projected to give simultaneous confidence intervals. Lecture #8-10/12/2005 Slide 60 of 68

67 Predictions for the Mean Response: For a vector of predictor values z_0, the 100(1 - \alpha)% simultaneous confidence intervals for the mean prediction are found by z_0'B_{(i)} \pm \sqrt{ [m(n - r - 1)/(n - r - m)] F_{m, n-r-m}(\alpha) } \sqrt{ z_0'(Z'Z)^{-1} z_0 \, [n/(n - r - 1)] \hat{\sigma}_{ii} }, where B_{(i)} is the ith column of B and \hat{\sigma}_{ii} is the ith diagonal element of \hat{\Sigma}, with \hat{\Sigma} = (1/n)(Y - ZB)'(Y - ZB). Lecture #8-10/12/2005 Slide 61 of 68

68 Predictions for the Individual Response: For a vector of predictor values z_0, the 100(1 - \alpha)% simultaneous prediction intervals for an individual response are found by z_0'B_{(i)} \pm \sqrt{ [m(n - r - 1)/(n - r - m)] F_{m, n-r-m}(\alpha) } \sqrt{ (1 + z_0'(Z'Z)^{-1} z_0) [n/(n - r - 1)] \hat{\sigma}_{ii} }, where \hat{\sigma}_{ii} is again the ith diagonal element of \hat{\Sigma} = (1/n)(Y - ZB)'(Y - ZB). Lecture #8-10/12/2005 Slide 62 of 68
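
The following SAS/IML sketch evaluates both sets of simultaneous intervals as written on the two preceding slides, using invented data and a hypothetical prediction point z_0; treat it as an illustration of the formulas rather than production code.

proc iml;
  /* hypothetical data: m = 2 outcomes, r = 2 predictors, n = 6 observations */
  Y  = {64 31, 73 38, 61 30, 76 40, 72 36, 80 43};
  z1 = {4, 6, 5, 7, 6, 8};
  z2 = {2, 4, 2, 5, 3, 5};
  n = nrow(Y);  m = ncol(Y);  r = 2;
  Z = j(n, 1, 1) || z1 || z2;
  B = inv(Z` * Z) * Z` * Y;
  E = Y - Z * B;
  SigmaHat = E` * E / n;                     /* m x m error covariance estimate */

  z0 = {1, 6, 4};                            /* predict at moisture = 6, sweetness = 4 */
  h  = z0` * inv(Z` * Z) * z0;               /* z0'(Z'Z)^{-1}z0                        */
  Fmult = sqrt( m # (n - r - 1) / (n - r - m) # finv(0.95, m, n - r - m) );

  do i = 1 to m;                             /* one pair of intervals per outcome */
    center  = z0` * B[, i];
    se_mean = sqrt( h       # (n / (n - r - 1)) # SigmaHat[i, i] );
    se_ind  = sqrt( (1 + h) # (n / (n - r - 1)) # SigmaHat[i, i] );
    ci_mean = center + {-1 1} # (Fmult # se_mean);
    ci_ind  = center + {-1 1} # (Fmult # se_ind);
    print i center ci_mean ci_ind;
  end;
quit;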

69 Final Exam: Two problems: 1. Crash test data; 2. Asthma data. For both, please double-space and have less than 10 pages of text per problem. Lecture #8-10/12/2005 Slide 63 of 68

70 Data Analysis #1: Crash test dummy data. Lecture #8-10/12/2005 Slide 64 of 68

71 Data Analysis #2 - Asthma (details shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 65 of 68

72 Data Analysis #2 - Air Pollution (details shown on slide; not transcribed). Lecture #8-10/12/2005 Slide 66 of 68

73 Final Thought: Profile analysis is helpful in determining the nature of differences between two groups. Multivariate regression is very similar to MANOVA in its application. A hint for the test: use of IML is not necessary. Lecture #8-10/12/2005 Slide 67 of 68

74 Next Time: Statistical case studies; reporting multivariate statistical analyses; a SAS macro for testing for multivariate normality. I will bring in some data sets and we will analyze them together. If you have data you would like analyzed (featuring topics we have used up to this point), send them to me via email. Lecture #8-10/12/2005 Slide 68 of 68
