Applied Multivariate Analysis
Seppo Pynnönen
Department of Mathematics and Statistics, University of Vaasa, Finland
Spring 2017

Discriminant Analysis
Outline
1 Discriminant analysis
    Background
    General Setup for the Discriminant Analysis
    Descriptive Discriminant Analysis
    Number of Discriminant Functions

Background
Example 1. Consider the following data on financial ratios for solvent and bankrupt companies.

Financial Ratios of Bankrupt and Solvent Companies, Altman (1968). Source: Morrison (1990), Multivariate Statistical Methods, 3rd ed., McGraw-Hill.

X1 = Working Capital / Total Assets
X2 = Retained Earnings / Total Assets
X3 = Earnings Before Interest and Taxes / Total Assets
X4 = Market Value of Equity / Total Value of Liabilities
X5 = Sales / Total Assets
Group: 1 = Bankrupt, 2 = Solvent
[Data table with columns Group, X1, X2, X3, X4, X5 for each company; the values were not preserved in this transcription.]
Relevant questions then are:
How do the companies in these two groups differ from each other?
Which ratios best discriminate between the groups?
Are the ratios useful for predicting bankruptcies?

Partial answers to these questions can be obtained by examining one variable at a time.
For example, sample statistics can be computed for each group. [Table of group statistics not preserved in this transcription.]
Some graphics may also be helpful. [Figure not preserved in this transcription.] A more complete use of the group separation information, however, is provided by discriminant analysis (DA).

General Setup for the Discriminant Analysis
Discriminant analysis is used for two purposes:
(1) describing the major differences among the groups, and
(2) classifying subjects on the basis of measurements.

Descriptive Discriminant Analysis
The starting setup: p variables and q mutually exclusive groups.
The goal of descriptive DA is to form k new variables such that
1. The new variables are uncorrelated.
2. The first new variable has the best discriminating power with respect to the given groups. The second has the second best discriminating power and is uncorrelated with the first, the third has the third best and is uncorrelated with the previous ones, and so on.

Remark 1. $k \le \min(p, q-1)$. For example, if $q = 2$, then $k = \min(p, 1) = 1$.
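Remark 1 can be checked with a one-line helper. This is only an illustrative sketch; the function name `max_discriminant_functions` is hypothetical and not part of the slides.

```python
def max_discriminant_functions(p, q):
    """Upper bound min(p, q - 1) on the number of discriminant functions.

    p: number of variables, q: number of groups (hypothetical helper).
    """
    if p < 1 or q < 2:
        raise ValueError("need at least one variable and two groups")
    return min(p, q - 1)


# Two groups and five ratios, as in the bankruptcy data: one function.
print(max_discriminant_functions(5, 2))
# Three groups and four measurements, as in the iris data: two functions.
print(max_discriminant_functions(4, 3))
```

With two groups the bound is always 1, whatever the number of variables.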
More precisely, suppose we have observations on random variables $x_1, \ldots, x_p$ from $q$ groups. Then the $j$th discriminant function is defined as a linear combination of the original variables

$y_j = a_{j1} x_1 + \cdots + a_{jp} x_p$, (1)

such that $\mathrm{corr}[y_j, y_l] = 0$ for $j \ne l$, and $y_1$ has the best discriminating power, $y_2$ the second best, and so on.
Remark 2. In the basic case the assumption is that the groups differ only with respect to the means of the variables. Consequently, the variances of the variables and the correlations between them are assumed to be the same in all groups (the groups have similar covariance structures).
The idea in deriving the discriminant functions is to divide the total variation into between-group and within-group variation,

$T = B + W$, (2)

where $T$ denotes the total covariance matrix, $B$ the between-group covariance matrix, and $W$ the within-group covariance matrix.
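For reference, one common way to write the three matrices in (2) is as sums of squares and cross-products. The notation below is an assumption, not taken from the slides: $x_{gi}$ is the $i$th observation vector in group $g$, $n_g$ the group size, $\bar{x}_g$ the group mean vector and $\bar{x}$ the overall mean vector. Covariance-scale versions differ only by degrees-of-freedom divisors.

```latex
\begin{align*}
T &= \sum_{g=1}^{q} \sum_{i=1}^{n_g} (x_{gi} - \bar{x})(x_{gi} - \bar{x})' , \\
W &= \sum_{g=1}^{q} \sum_{i=1}^{n_g} (x_{gi} - \bar{x}_g)(x_{gi} - \bar{x}_g)' , \\
B &= \sum_{g=1}^{q} n_g \, (\bar{x}_g - \bar{x})(\bar{x}_g - \bar{x})' .
\end{align*}
```

Writing $x_{gi} - \bar{x} = (x_{gi} - \bar{x}_g) + (\bar{x}_g - \bar{x})$ and expanding the product, the cross terms sum to zero within each group, which gives $T = B + W$.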
Technically the problem reduces again to an eigenvalue problem. In this case the eigenvalues are extracted from the matrix

$B W^{-1}$. (3)

The resulting eigenvectors form the coefficients of the discriminant functions $y_j$, $j = 1, \ldots, k$, with $k = \min(q-1, p)$. The functions are called canonical discriminant functions.
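The eigenvalue problem can be illustrated on a toy case with two variables and two groups. This is a minimal sketch under stated assumptions. The data are made up, all function names are hypothetical, and W and B are computed as sums of squares and cross-products. The sketch solves $W^{-1}B a = \lambda a$, which has the same eigenvalues as $BW^{-1}$ and matches the `INV(E)*H` form printed by SAS. With two groups, $B$ has rank one, so the single coefficient vector is proportional to $W^{-1}(\bar{x}_1 - \bar{x}_2)$.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def outer_add(M, d, weight=1.0):
    """Add weight * d d' to the 2x2 matrix M in place."""
    for i in range(2):
        for j in range(2):
            M[i][j] += weight * d[i] * d[j]


def within_between(groups):
    """Within (W) and between (B) SSCP matrices for two variables."""
    grand = mean_vector([r for g in groups for r in g])
    W = [[0.0, 0.0], [0.0, 0.0]]
    B = [[0.0, 0.0], [0.0, 0.0]]
    for g in groups:
        m = mean_vector(g)
        for r in g:
            outer_add(W, [r[0] - m[0], r[1] - m[1]])
        outer_add(B, [m[0] - grand[0], m[1] - grand[1]], weight=len(g))
    return W, B


def matvec(M, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def two_group_discriminant(g1, g2):
    """Coefficient vector a and eigenvalue lam with W^{-1} B a = lam * a."""
    W, B = within_between([g1, g2])
    W_inv = inverse_2x2(W)
    m1, m2 = mean_vector(g1), mean_vector(g2)
    a = matvec(W_inv, [m1[0] - m2[0], m1[1] - m2[1]])
    # B has rank one here, so the only nonzero eigenvalue of W^{-1} B
    # equals the trace of W^{-1} B.
    col0 = matvec(W_inv, [B[0][0], B[1][0]])
    col1 = matvec(W_inv, [B[0][1], B[1][1]])
    lam = col0[0] + col1[1]
    return a, lam


# Made-up observations, for illustration only.
group1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.5), (2.5, 2.0)]
group2 = [(4.0, 1.0), (5.0, 2.5), (6.0, 2.0), (5.5, 1.0)]
a, lam = two_group_discriminant(group1, group2)
print("coefficients:", a)
print("eigenvalue:", lam)
```

The coefficient vector is determined only up to scale. Statistical packages apply their own normalization, so raw coefficients reported by SAS or SPSS will generally be a multiple of `a`.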
Example 2. Consider the bankruptcy data. Use SAS proc candisc or SPSS (Analyze → Classify → Discriminant). Below are the SAS results.

Example: Discriminant analysis applied to bankrupt data
Canonical Discriminant Analysis
66 Observations, 65 DF Total
5 Variables, 64 DF Within Classes
2 Classes, 1 DF Between Classes
Class Level Information (GROUP, Frequency, Weight, Proportion): values not preserved in this transcription.
The SAS output continues with the following tables (the numerical values were not preserved in this transcription):

Within-Class Covariance Matrices of X1 to X5, for GROUP = 1 (DF = 32) and GROUP = 2 (DF = 32)
Simple Statistics (N, Mean, Variance, Std Dev of X1 to X5) for the total sample, GROUP = 1, and GROUP = 2
Univariate Test Statistics (F statistics, Num DF = 1, Den DF = 64): Total STD, Pooled STD, Between STD, R-Squared, RSQ/(1-RSQ), F, and Pr > F for X1 to X5
Average R-Squared: unweighted and weighted by variance
Multivariate Statistics and Exact F Statistics (S = 1, M = 1.5, N = 29): Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, and Roy's Greatest Root, each with Value, F, Num DF, Den DF, and Pr > F
Canonical Correlation, Adjusted Canonical Correlation, Approx Standard Error, Squared Canonical Correlation
Eigenvalues of INV(E)*H = CanRsq/(1-CanRsq): Eigenvalue, Difference, Proportion, Cumulative
Test of H0: the canonical correlations in the current row and all that follow are zero (Likelihood Ratio, Approx F, Num DF, Den DF, Pr > F). NOTE: The F statistic is exact.
Total Canonical Structure (CAN1 for X1 to X5)
Between Canonical Structure (CAN1 for X1 to X5)
Pooled Within Canonical Structure (CAN1 for X1 to X5)
Total-Sample Standardized Canonical Coefficients (CAN1 for X1 to X5)
Pooled Within-Class Standardized Canonical Coefficients (CAN1 for X1 to X5)
Raw Canonical Coefficients (CAN1 for X1 to X5)
Class Means on Canonical Variables (GROUP, CAN1)
The output includes several coefficient matrices. The structure matrices describe the correlations of the original variables with the discriminant function. The most useful of these for interpretation is the within canonical structure. With multiple groups, the between canonical structure may also give useful additional information: it shows how the group means of the variables are correlated with the group means of the discriminant functions.
The standardized coefficients are obtained by multiplying the raw coefficients by the standard deviations of the variables. These coefficients give the marginal effect of each standardized variable on the discriminant function. The discriminant function is labeled on the basis of the variables with the largest correlations and the largest standardized coefficients.
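The relation between raw and standardized coefficients follows from $a x = (a s)(x / s)$: when a variable is divided by its standard deviation $s$, its coefficient must be multiplied by $s$ for the discriminant score to stay the same. A small sketch with made-up numbers (the function names and all values are hypothetical):

```python
def standardized_coefficients(raw, std_devs):
    """Scale each raw coefficient by the standard deviation of its variable."""
    if len(raw) != len(std_devs):
        raise ValueError("raw and std_devs must have the same length")
    return [a * s for a, s in zip(raw, std_devs)]


def score(coefs, values):
    """Linear combination of the values with the given coefficients."""
    return sum(c * v for c, v in zip(coefs, values))


raw = [2.0, -0.5]       # hypothetical raw coefficients
std_devs = [0.5, 4.0]   # hypothetical standard deviations
x = [1.5, 8.0]          # one hypothetical centered observation

std = standardized_coefficients(raw, std_devs)
z = [v / s for v, s in zip(x, std_devs)]  # the standardized observation

# Both parameterizations give the same discriminant score.
print(score(raw, x), score(std, z))
```

Which standard deviation is used (total-sample or pooled within-class) determines which of the two standardized coefficient tables in the SAS output is obtained.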
Example 3. From the within canonical structure we observe:
X2 (Retained Earnings / Total Assets) has the highest correlation with the discriminant function.
X4 (Market Value of Equity / Total Value of Liabilities), X1 (Working Capital / Total Assets), and X3 (Earnings Before Interest and Taxes / Total Assets) have the next highest correlations.
X5 (Sales / Total Assets) has a small correlation, but a large standardized coefficient.

In summary, profitability and a high market value are the properties that protect a company from bankruptcy.
It should be noted that the basic assumption in discriminant analysis is that the variables are normally distributed in each of the groups and that the covariance matrices are the same. The former assumption is harder to test. The latter is easier (in SPSS, select Box's M from the options). If the covariance matrices are not the same, linear discriminant function analysis is invalid, and one should move to quadratic discriminant function analysis. That method, however, is intended for classification purposes.
Example 4. Testing for the equality of the population covariance matrices:

$H_0: \Sigma_1 = \Sigma_2$, (4)

where $\Sigma_i$ is the covariance matrix of population $i$ ($i = 1, 2$). SPSS gives the result: test chi-square value = [value not preserved] with 15 degrees of freedom and p-value = [value not preserved]. The null hypothesis is rejected, hence the analysis results should be interpreted with caution.
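The 15 degrees of freedom agree with the usual chi-square approximation to Box's M test, whose degrees of freedom are $p(p+1)(q-1)/2$ for $p$ variables and $q$ groups. A short check, assuming that approximation is the one reported (the function name is hypothetical):

```python
def box_m_degrees_of_freedom(p, q):
    """Degrees of freedom p(p+1)(q-1)/2 of the chi-square approximation."""
    if p < 1 or q < 2:
        raise ValueError("need at least one variable and two groups")
    # p * (p + 1) is always even, so integer division is exact.
    return p * (p + 1) * (q - 1) // 2


# Five financial ratios and two groups, as in the bankruptcy data.
print(box_m_degrees_of_freedom(5, 2))
```

The count is the number of distinct elements in one covariance matrix, $p(p+1)/2$, multiplied by the $q - 1$ independent comparisons among the groups.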

Number of Discriminant Functions
With multiple groups (q > 2) the question is: in how many dimensions do the groups differ? With two groups this is not an issue, because the groups can differ in only one dimension. In general, however, there can be more discriminating dimensions if q > 2.
Example 5. The following data are a classic example concerning different species of iris (setosa, versicolor, and virginica). The following measurements were made:
SL: Sepal length
SW: Sepal width
PL: Petal length
PW: Petal width
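The CANDISC procedure produces the following results. The data step reads the measurements and labels the species:

    title;
    data iris;
       title 'Discriminant Analysis of Fisher (1936) Iris Data';
       /* spec_no and the trailing @@ were lost in transcription; restored by assumption */
       input sepallen sepalwid petallen petalwid spec_no @@;
       /* without an explicit length, species would be truncated to 6 characters */
       length species $ 10;
       if spec_no=1 then species='SETOSA';
       if spec_no=2 then species='VERSICOLOR';
       if spec_no=3 then species='VIRGINICA';
       label sepallen='Sepal Length in mm.'
             sepalwid='Sepal Width in mm.'
             petallen='Petal Length in mm.'
             petalwid='Petal Width in mm.';
       datalines;

(The data lines were not preserved in this transcription.)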
    title 'Canonical Discriminant Analysis of IRIS data';
    proc candisc data=iris;
       class species;
       var sepallen--petalwid;
    run;

This gives the following results (numerical values were not preserved in this transcription):

Canonical Discriminant Analysis of IRIS data
150 Observations, 149 DF Total
4 Variables, 147 DF Within Classes
3 Classes, 2 DF Between Classes
Class Level Information (SPECIES, Frequency, Weight, Proportion) for SETOSA, VERSICOLOR, VIRGINICA
Multivariate Statistics and F Approximations (S = 2, M = 0.5, N = 71): Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, and Roy's Greatest Root, each with Value, F, Num DF, Den DF, and Pr > F
NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.
The output continues with the following tables (numerical values not preserved). In each coefficient table the rows are SEPALLEN (Sepal Length in mm.), SEPALWID (Sepal Width in mm.), PETALLEN (Petal Length in mm.), and PETALWID (Petal Width in mm.), and the columns are CAN1 and CAN2.

Canonical Correlation, Adjusted Canonical Correlation, Approx Standard Error, Squared Canonical Correlation
Eigenvalues of INV(E)*H = CanRsq/(1-CanRsq): Eigenvalue, Difference, Proportion, Cumulative
Test of H0: the canonical correlations in the current row and all that follow are zero (Likelihood Ratio, Approx F, Num DF, Den DF, Pr > F)
Total Canonical Structure
Between Canonical Structure
Pooled Within Canonical Structure
Total-Sample Standardized Canonical Coefficients
Pooled Within-Class Standardized Canonical Coefficients
Raw Canonical Coefficients
Class Means on Canonical Variables (CAN1 and CAN2 for SETOSA, VERSICOLOR, VIRGINICA)
The Wilks' lambda test indicates that there are two statistically significant discriminators at the five percent level. In general, the hypotheses to be tested are, as in factor analysis,

$H_0$: the number of discriminators = m
$H_1$: more are needed (5)

On the basis of the within-matrices, the first discriminator indicates that the species differ with respect to the overall size of the leaves, and the second discriminator that the species differ also with respect to the width of the leaves.
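One standard way to test hypotheses of the form (5) sequentially is Bartlett's chi-square approximation, sketched below. It is not the F approximation that SAS prints. The function names and the eigenvalues in the usage lines are hypothetical. The residual Wilks' lambda after removing the first $m$ discriminators is $\Lambda_m = \prod_{j>m} 1/(1+\lambda_j)$, and the statistic $-(n - 1 - (p+q)/2)\ln\Lambda_m$ is referred to a chi-square distribution with $(p-m)(q-m-1)$ degrees of freedom.

```python
import math


def residual_wilks_lambda(eigenvalues, m):
    """Wilks' lambda built from the eigenvalues that remain after the first m.

    eigenvalues: eigenvalues of W^{-1} B, sorted in decreasing order.
    """
    lam = 1.0
    for ev in eigenvalues[m:]:
        lam /= 1.0 + ev
    return lam


def bartlett_test(eigenvalues, n, p, q, m):
    """Chi-square statistic and df for H0: m discriminators are enough.

    n: number of observations, p: variables, q: groups.
    """
    scale = n - 1 - (p + q) / 2.0
    stat = -scale * math.log(residual_wilks_lambda(eigenvalues, m))
    df = (p - m) * (q - m - 1)
    return stat, df


# Hypothetical eigenvalues with the iris dimensions (n=150, p=4, q=3).
eigs = [1.0, 0.25]
for m in range(len(eigs)):
    stat, df = bartlett_test(eigs, n=150, p=4, q=3, m=m)
    print("m =", m, "chi-square =", stat, "df =", df)
```

For $p = 4$ and $q = 3$ the degrees of freedom are 8 for $m = 0$ and 3 for $m = 1$. Testing proceeds from $m = 0$ upward and stops at the first $m$ for which $H_0$ is not rejected.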
Example 9.6: Bankruptcy risk and signal to reorganization of a company (Laitinen, Luoma, Pynnönen 1996, UV, Discussion Papers 200). Thus we have four groups.
Sample statistics: Table 7. Descriptive statistics of groups for estimation data.

The table reports the Mean and Std Dev of each variable in the groups B1 (n=20), B2 (n=20), N3 (n=17), and N4 (n=23), together with the F statistic for equality of means. The numerical values were not preserved in this transcription; only the significance markers remain:

Variable   F for equality of means
ROI        ***
TCF        ***
QRA        **
SCA        ***
DSR        ***

** = significant at level 0.01, *** = significant at level 0.001
Number of canonical discriminant functions: [table not preserved in this transcription]. The results indicate that the third canonical discriminant function is also statistically significant.
Canonical structure and standardized coefficients: Table 11. Canonical structure and standardized canonical coefficients, both as pooled within.

The table gives, for each of the variables ROI, TCF, QRA, SCA, and DSR, the canonical structure* (CAN1, CAN2, CAN3) and the standardized coefficients (CAN1, CAN2, CAN3). The numerical values were not preserved in this transcription.

*Correlation coefficients between original variables and canonical variables.
Interpretation of the discriminant functions: [content not preserved in this transcription]
Group differences: [content not preserved in this transcription]
CAN1, financial performance, shows that financial performance is the main characteristic differentiating healthy and bankrupt firms (as expected).
CAN2, the contrast between dynamic liquidity and static ratios, is the characteristic differentiating reorganizable non-bankrupt firms from reorganizable bankrupt firms.
CAN3, the contrast between liquidity and the other ratios, differentiates reorganizable non-bankrupt firms from healthy firms. The distinction is probably due to the fact that non-bankrupt firms may have cash reserves (high liquidity) but do not use them profitably.
[SAS output not preserved: Plot of Canonical Variables Identified by Cluster, followed by the FASTCLUS procedure (Replace=FULL, Radius=0, Maxclusters=2, Maxiter=10, Converge=0.02) with initial seeds for SepalLength, SepalWidth, PetalLength, and PetalWidth.]
1998, Gregory Carey Repeated Measures ANOVA - 1 REPEATED MEASURES ANOVA (incomplete) Repeated measures ANOVA (RM) is a specific type of MANOVA. When the within group covariance matrix has a special form,
More informationPrincipal component analysis
Principal component analysis Motivation i for PCA came from major-axis regression. Strong assumption: single homogeneous sample. Free of assumptions when used for exploration. Classical tests of significance
More informationT.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS
ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only
More informationMultivariate Analysis of Variance
Multivariate Analysis of Variance 1 Multivariate Analysis of Variance Objective of Multivariate Analysis of variance (MANOVA) is to determine whether the differences on criterion or dependent variables
More informationVisualizing Tests for Equality of Covariance Matrices Supplemental Appendix
Visualizing Tests for Equality of Covariance Matrices Supplemental Appendix Michael Friendly and Matthew Sigal September 18, 2017 Contents Introduction 1 1 Visualizing mean differences: The HE plot framework
More informationChapter 2 Multivariate Normal Distribution
Chapter Multivariate Normal Distribution In this chapter, we define the univariate and multivariate normal distribution density functions and then we discuss the tests of differences of means for multiple
More informationSTAT 730 Chapter 1 Background
STAT 730 Chapter 1 Background Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 27 Logistics Course notes hopefully posted evening before lecture,
More informationComparisons of Several Multivariate Populations
Comparisons of Several Multivariate Populations Edps/Soc 584, Psych 594 Carolyn J Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees,
More informationTHE UNIVERSITY OF CHICAGO Booth School of Business Business 41912, Spring Quarter 2012, Mr. Ruey S. Tsay
THE UNIVERSITY OF CHICAGO Booth School of Business Business 41912, Spring Quarter 2012, Mr. Ruey S. Tsay Lecture 3: Comparisons between several multivariate means Key concepts: 1. Paired comparison & repeated
More information6 Multivariate Regression
6 Multivariate Regression 6.1 The Model a In multiple linear regression, we study the relationship between several input variables or regressors and a continuous target variable. Here, several target variables
More informationM M Cross-Over Designs
Chapter 568 Cross-Over Designs Introduction This module calculates the power for an x cross-over design in which each subject receives a sequence of treatments and is measured at periods (or time points).
More informationGregory Carey, 1998 Regression & Path Analysis - 1 MULTIPLE REGRESSION AND PATH ANALYSIS
Gregory Carey, 1998 Regression & Path Analysis - 1 MULTIPLE REGRESSION AND PATH ANALYSIS Introduction Path analysis and multiple regression go hand in hand (almost). Also, it is easier to learn about multivariate
More informationMultivariate Data Analysis Notes & Solutions to Exercises 3
Notes & Solutions to Exercises 3 ) i) Measurements of cranial length x and cranial breadth x on 35 female frogs 7.683 0.90 gave x =(.860, 4.397) and S. Test the * 4.407 hypothesis that =. Using the result
More informationQuiz #3 Research Hypotheses that Involve Comparing Non-Nested Models
Quiz #3 Research Hypotheses that Involve Comparing Non-Nested Models The researcher also wanted to test the hypothesis that students with internal versus external locus of control could be better distinguished
More informationHypothesis Testing for Var-Cov Components
Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output
More informationLEC 4: Discriminant Analysis for Classification
LEC 4: Discriminant Analysis for Classification Dr. Guangliang Chen February 25, 2016 Outline Last time: FDA (dimensionality reduction) Today: QDA/LDA (classification) Naive Bayes classifiers Matlab/Python
More informationInferences about a Mean Vector
Inferences about a Mean Vector Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees, University
More informationHYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC
1 HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 7 steps of Hypothesis Testing 1. State the hypotheses 2. Identify level of significant 3. Identify the critical values 4. Calculate test statistics 5. Compare
More informationMultivariate Statistics
Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2016/2017 Master in Mathematical
More informationApplied Multivariate Analysis
Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Dimension reduction Principal Component Analysis (PCA) The problem in exploratory multivariate data analysis usually is
More informationYou can compute the maximum likelihood estimate for the correlation
Stat 50 Solutions Comments on Assignment Spring 005. (a) _ 37.6 X = 6.5 5.8 97.84 Σ = 9.70 4.9 9.70 75.05 7.80 4.9 7.80 4.96 (b) 08.7 0 S = Σ = 03 9 6.58 03 305.6 30.89 6.58 30.89 5.5 (c) You can compute
More informationPRINCIPAL COMPONENTS ANALYSIS (PCA)
PRINCIPAL COMPONENTS ANALYSIS (PCA) Introduction PCA is considered an exploratory technique that can be used to gain a better understanding of the interrelationships between variables. PCA is performed
More informationSAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1
CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to
More informationPOWER AND TYPE I ERROR RATE COMPARISON OF MULTIVARIATE ANALYSIS OF VARIANCE
POWER AND TYPE I ERROR RATE COMPARISON OF MULTIVARIATE ANALYSIS OF VARIANCE Supported by Patrick Adebayo 1 and Ahmed Ibrahim 1 Department of Statistics, University of Ilorin, Kwara State, Nigeria Department
More informationTHE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam
THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay Solutions to Final Exam 1. (13 pts) Consider the monthly log returns, in percentages, of five
More informationTechniques and Applications of Multivariate Analysis
Techniques and Applications of Multivariate Analysis Department of Statistics Professor Yong-Seok Choi E-mail: yschoi@pusan.ac.kr Home : yschoi.pusan.ac.kr Contents Multivariate Statistics (I) in Spring
More informationTHE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Final Exam
THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay Solutions to Final Exam 1. City crime: The distance matrix is 694 915 1073 528 716 881 972 464
More informationOdor attraction CRD Page 1
Odor attraction CRD Page 1 dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************;
More informationTopic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model
Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is
More information4 Statistics of Normally Distributed Data
4 Statistics of Normally Distributed Data 4.1 One Sample a The Three Basic Questions of Inferential Statistics. Inferential statistics form the bridge between the probability models that structure our
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More information