Multivariate Statistics Summary and Comparison of Techniques P The key to multivariate statistics is understanding conceptually the relationship among techniques with regards to: < The kinds of problems each technique is suited for < The objective(s) of each technique < The data structure required for each technique < Sampling considerations for each technique < Underlying mathematical model, or lack thereof, of each technique < Potential for complementary use of techniques 1 Model y1 + y2 +... + yi y 1 + y 2 +... + y i = x y 1 + y 2 +... + y i = x 1 + x 2 +... + x j Techniques Unconstrained Ordination Multivariate ANOVA Multi-Response Permutation Analysis of Similarities Mantel Test Discriminant Analysis Logistic Regression Classification Trees Indicator Species Analysis Constrained Ordination Canonical Correlation Multivariate Regression Trees 2
Technique Objective Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) 3 Extract gradients of maximum variation Establish groups of similar entities Test for & describe differences among groups of entities or predict group membership Extract gradients of variation in dependent variables explainable by independent variables Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Variance Emphasis Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance; describes the interentity variance structure. 4
Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Variance Emphasis Emphasizes both differences and similarities among individual sampling entities by clustering entities based on inter-entity resemblance. 5 Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Variance Emphasis Emphasizes variation among groups of sampling entities; describes the inter-group variance structure. 6
Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Variance Emphasis Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance explainable by environmental variables 7 Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Dependence Type Interdependence Interdependence Dependence Dependence (& interdependence?) 8
Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Data Structure One set; >>2 variables (Y) One set; >>2 varibles (Y) Two sets; 1 grouping variable (X), >>2 discriminating variables (Y) Two sets; >>2 response (Y) variables, $1 explanatory (X) variables 9 Obs Group Y-set X-set 1 A a 11 a 12 a 13... a 1p b 11 b 12 b 13... b 1m 2 A a 21 a 22 a 23... a 2p b 21 b 22 b 23... b 2m 3 A a 31 a 32 a 33... a 3p b 31 b 32 b 33... b 3m............................ n A a n1 a n2 a n3... a np b n1 b n2 b n3... b nm n+1 C c 11 c 12 c 13... c 1p n+2 C c 21 c 22 c 23... c 2p n+3 C c 31 c 32 c 33... c 3p................ N C c n1 c n2 c n3... c np Unconstrained Ordination (PCA, PCO, CA, DCA, NMDS) (Family of techinques) 10
Obs Group Y-set X-set 1 A a 11 a 12 a 13... a 1p b 11 b 12 b 13... b 1m 2 A a 21 a 22 a 23... a 2p b 21 b 22 b 23... b 2m 3 A a 31 a 32 a 33... a 3p b 31 b 32 b 33... b 3m............................ n A a n1 a n2 a n3... a np b n1 b n2 b n3... b nm n+1 C c 11 c 12 c 13... c 1p n+2 C c 21 c 22 c 23... c 2p n+3 C c 31 c 32 c 33... c 3p................ N C c n1 c n2 c n3... c np Discrimination Techniques (MANOVA, MRPP, ANOSIM, Mantel; DA, LR, CART, ISA) 11 Obs Group Y-set X-set 1 A a 11 a 12 a 13... a 1p b 11 b 12 b 13... b 1m 2 A a 21 a 22 a 23... a 2p b 21 b 22 b 23... b 2m 3 A a 31 a 32 a 33... a 3p b 31 b 32 b 33... b 3m............................ n A a n1 a n2 a n3... a np b n1 b n2 b n3... b nm n+1 C c 11 c 12 c 13... c 1p n+2 C c 21 c 22 c 23... c 2p n+3 C c 31 c 32 c 33... c 3p................ N C c n1 c n2 c n3... c np Constrained Ordination (RDA, CCA, CAP, COR) MRT 12
Technique Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) Sample Characteristics N (from one or unknown # pops) N (from unknown # pop's) N (from known # pop's) or N1, N2,... (from separate pop's) N (from one pop) 13 P Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... Unconstrained Ordination 14
P Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... P Assume linear relationship to ecological gradients... Unconstrained Ordination PCA, PCO(MDS) 15 P Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... P Assume linear relationship to ecological gradients... P Assume unimodal relationship to ecological gradients... Unconstrained Ordination PCA, PCO(MDS) CA(RA), DCA 16
P Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... P Assume linear relationship to ecological gradients... P Assume unimodal relationship to ecological gradients... P Assume no particular relationship; only monotonic relationship between input and output dissimilarities... Unconstrained Ordination PCA, PCO(MDS) CA(RA), DCA Alt: PCA* NMDS 17 P Establish artificial classes or groups of similar entities where pre-specified, welldefined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... 18
P Establish artificial classes or groups of similar entities where pre-specified, welldefined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... P Assign entities to a specified number of groups to maximize within-group similarity or form composite clusters... Non-hierarchical 19 P Establish artificial classes or groups of similar entities where pre-specified, welldefined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... P Assign entities to a specified number of groups to maximize within-group similarity or form composite clusters... P Assign entities to groups and display relationships among groups as they form... 20 Non-hierarchical Hierarchical
P Establish artificial classes or groups of entities with similar species composition and abundance where pre-specified, well-defined groups do not already exist, based on measured environmental variables, and/or to portray sampling entities in "discrete" groups representing species assemblages with distinct environmental affinities, then use... Constrained (MRT) 21 P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Test for significant differences among groups... < Parametric test... MANOVA / DA 22
P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Test for significant differences among groups... < Parametric test... < Nonparametric tests... MANOVA / DA MRPP, ANOSIM, Mantel 23 P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Describe the major ecological differences among groups... < Assume a linear discrimination function... DA (CAD) 24
P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Describe the major ecological differences among groups... < Assume a linear discrimination function... < Assume a logistic discrimination function... DA (CAD) LR (MLR) 25 P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Describe the major ecological differences among groups... < Assume a linear discrimination function... < Assume a logistic discrimination function... < Do not assume any particular function... DA (CAD) LR (MLR) CART (UCT) 26
P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Describe the major ecological differences among groups... < Assume a linear discrimination function... < Assume a logistic discrimination function... < Do not assume any particular function... < Identify indicators for each group... DA (CAD) LR (MLR) CART (UCT) ISA 27 P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Predict group membership of future observations... < Linear classification function... DA (LDF) 28
P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Predict group membership of future observations... < Linear classification function... < Logistic classification function... DA (LDF) LR (MLR) 29 P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Predict group membership of future observations... < Linear classification function... < Logistic classification function... < Decision tree classifier... DA (LDF) LR (MLR) CART (UCT) 30
P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Predict group membership of future observations... < Linear classification function... < Logistic classification function... < Decision tree classifier... < Other nonparametric classifiers... DA (LDF) LR (MLR) CART (UCT) Kernel K nearest-neighbor 31 P Explain the variation in a single continuous dependent variable using two or more continuous independent variables, and/or to develop a model for predicting the value of the dependent variable from the values of the independent variables, then use... Alternatives: Multiple Linear Regression CART (URT) 32
P Explain the variation in a single dichotomous dependent (grouping) variable using two or more continuous and/or categorical independent variables, and/or to develop a model for predicting the group membership of a sampling entity from the values of the independent variables, then use... Alternatives: Multiple Logistic Regression DA CART (UCT) 33 P Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... Constrained Ordination or MRT 34
P Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... P Assume linear response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... Constrained Ordination or MRT RDA, CAP 35 P Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... P Assume linear response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... P Assume unimodal response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... Constrained Ordination or MRT RDA, CAP CCA, DCCA Alt: RDA* 36
P Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... P Assume linear response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... P Assume unimodal response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... P Do not assume any response function... Constrained Ordination or MRT RDA, CAP CCA, DCCA MRT 37 P Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices; i.e., dissimilarities between samples, then use... P Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices after accounting for a third set of variables (i.e., Y~X Z), then use... Mantel Test Partial Mantel Test 38
Dependence Techniques Independent Variables 39 Dependence Techniques CT = Contingency tables SLR = Simple logistic regression MLR = Multiple logistic regression SRA = Simple linear regression MRA = Multiple linear regression T-test = T-test ANOVA = Analysis of variance UCT = Univar. classification trees URT = Univar. regression trees T 2 -test MANOVA = Multivariate analysis of variance DA = Discriminant analysis ISA RDA = Redundancy analysis CCA = Can. correspond. analysis COR = Canonical corr. analysis MRT = Hotelling s T 2 = Indicator species analysis CAP = Can. prin. coord. analysis = Multivar. regression trees 40
Dependence Techniques Independent Variables CT CT CT SLR SLR MLR UCT CT CT CT UCT DA DA SLR MLR UCT UCT DA DA CT RDA CT RDA CT RDA RDA DA CAP DA CAP MRT CAP CAP MRT CCA CCA COR CCA CCA COR SRA SRA MRA SRA MRA T-test ANOVA ANOVA URT URT T 2 -test RDA Manova RDA Manova RDA RDA DA CAP DA CAP MRT CAP CAP MRT ISA CCA ISA CCA COR CCA CCA COR RDA CAP CCA RDA CAP CCA 41 Advantages of Multivariate Statistics P Reflect more accurately the true multidimensional, multivariate nature of natural systems. P Provide a way to handle large data sets with large numbers of variables. P Provide a way of summarizing redundancy in large data sets. P Provide rules for combining variables in an "optimal" way. 42
Advantages of Multivariate Statistics P Provide a solution to the multiple comparison problem by controlling experimentwise error rate. P Provide a means of detecting and quantifying truly multivariate patterns that arise out of the correlational structure of the variable set. P Provide a means of exploring complex data sets for patterns and relationships from which hypotheses can be generated and subsequently tested experimentally. 43