An Introduction to Ordination Connie Clark


Ordination is a collective term for multivariate techniques that adapt a multidimensional swarm of data points in such a way that when it is projected onto a two-dimensional space any intrinsic pattern the data may possess becomes apparent upon visual inspection (Pielou, 1984). In practice, ordination summarizes community data (such as species abundance data) by producing a low-dimensional ordination space in which similar species and samples are plotted close together, and dissimilar species and samples are placed far apart.

Generally, ordination techniques are used to describe relationships between species composition patterns and the underlying environmental gradients that influence these patterns (asking: what factors structure the community?). For example, if you wanted to examine the distribution patterns of tree species in the Sierra Nevada mountain range, ordination could be used to determine which species are commonly found together, and how the species composition of the community changes with increasing elevation. More recently, the use of ordination techniques has expanded to include the analysis of dietary overlap (Schluter and Grant, 1982) and the exploration of patterns of within-species morphological differences with geographic distance between populations (Alisauskas, 1998).

Data

Commonly, data interpreted using ordination are collected in a species-by-sample data matrix, similar to the matrix presented below. Sample data may include measures of density, biomass, frequency, importance values, presence/absence, or any number of abundance measures.

       E7000   E6580   E6000   E5400   E5000   E4000   E2850   E1800
ABMA    88.6   144.4    21.7    52.2       0       0       0       0
ABCO   211.4   149.3   243.2   190.5   102.4    12.4    18.7       0
ACMA       0       0       0       0       0    13.1     5.5       0
ARME       0       0       0       0       0       0       0    34.9
CADE       0       0       0       3      65    33.8    36.4    28.7
CONU       0       0       0       0       0       0       0      11
LIDE       0       0       0       0       0     2.4       0   136.6
PICO       0       0    11.2     2.2       0       0    14.7       0
PILA       0       0       0       4       0       0       0    12.9
PIPO       0       0     0.8     3.5      10    85.1    23.5    64.4
PIJE       0     6.4     9.8    28.1    16.7       0       0       0
PSME       0       0       0    17.1   105.5    48.8     125     5.5
QUCH       0       0       0       0       0    52.2    19.6       0
QUWI       0       0       0       0       0      10     7.7       0
QUKE       0       0       0       0       0    47.5    46.4       0

The above is a relatively simple data set. However, it is easy to imagine that a real data set may include dozens of species across hundreds of samples. Complex sample-by-species matrices represent dozens to hundreds of dimensions that are impossible to visualize or interpret directly. Even when graphed, the species response curves of large community data sets can be nearly impossible to interpret, as they resemble a mess of overlapping peaks and depressions (as shown here).
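To make the idea of "dimensions" concrete, the table above can be held as an array with one row per species and one column per sample, so that each sample is a point in 15-dimensional species space. The sketch below is illustrative only: it uses Python with NumPy, the variable and function names are hypothetical, and Bray-Curtis dissimilarity is an assumed choice of measure for comparing samples, not one prescribed in this introduction.

```python
import numpy as np

species = ["ABMA", "ABCO", "ACMA", "ARME", "CADE", "CONU", "LIDE", "PICO",
           "PILA", "PIPO", "PIJE", "PSME", "QUCH", "QUWI", "QUKE"]
samples = ["E7000", "E6580", "E6000", "E5400", "E5000", "E4000", "E2850", "E1800"]

# Rows follow `species`, columns follow `samples` (values copied from the table).
abundance = np.array([
    [88.6, 144.4, 21.7, 52.2, 0, 0, 0, 0],
    [211.4, 149.3, 243.2, 190.5, 102.4, 12.4, 18.7, 0],
    [0, 0, 0, 0, 0, 13.1, 5.5, 0],
    [0, 0, 0, 0, 0, 0, 0, 34.9],
    [0, 0, 0, 3, 65, 33.8, 36.4, 28.7],
    [0, 0, 0, 0, 0, 0, 0, 11],
    [0, 0, 0, 0, 0, 2.4, 0, 136.6],
    [0, 0, 11.2, 2.2, 0, 0, 14.7, 0],
    [0, 0, 0, 4, 0, 0, 0, 12.9],
    [0, 0, 0.8, 3.5, 10, 85.1, 23.5, 64.4],
    [0, 6.4, 9.8, 28.1, 16.7, 0, 0, 0],
    [0, 0, 0, 17.1, 105.5, 48.8, 125, 5.5],
    [0, 0, 0, 0, 0, 52.2, 19.6, 0],
    [0, 0, 0, 0, 0, 10, 7.7, 0],
    [0, 0, 0, 0, 0, 47.5, 46.4, 0],
])


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    return np.abs(a - b).sum() / (a + b).sum()


# Pairwise dissimilarity between samples (columns of the matrix).
n_samples = abundance.shape[1]
dissimilarity = np.zeros((n_samples, n_samples))
for j in range(n_samples):
    for k in range(n_samples):
        dissimilarity[j, k] = bray_curtis(abundance[:, j], abundance[:, k])

print(abundance.shape)  # (number of species, number of samples)
print(np.round(dissimilarity, 2))
```

Under this measure, the highest and lowest samples (E7000 and E1800) share no species and so are maximally dissimilar, while neighbouring samples such as E7000 and E6580 are much more alike.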

Ordination can help us find structure in these complicated data sets. Using various mathematical calculations (which will not be discussed here), ordination techniques identify similarity between species and between samples. Results are then projected onto two dimensions in such a way that the species and samples most similar to one another will be close together, and the species and samples most dissimilar from one another will appear farther apart (as shown below).

Ordination techniques:

There are several different ordination techniques, which differ slightly in the mathematical approach used to calculate species and sample similarity/dissimilarity. Rather than reinventing the wheel by discussing each of these techniques in depth, I will offer only a brief description of the most commonly used methods here. Further details can be found in the following suggested references:

Gauch, H. G., Jr. 1982. Multivariate Analysis in Community Ecology. Cambridge University Press, Cambridge.

Causton, D. R. 1988. An Introduction to Vegetation Analysis. Unwin Hyman, London.

Kent, M., and P. Coker. 1992. Vegetation Description and Analysis: A Practical Approach. Belhaven Press, London.

Pielou, E. C. 1984. The Interpretation of Ecological Data: A Primer on Classification and Ordination. Wiley, New York.

Økland, R. H. 1990. Vegetation ecology: theory, methods and applications with reference to Fennoscandia. Sommerfeltia Supplement 1:1-233.

Jongman, R. H. G., C. J. F. ter Braak, and O. F. R. van Tongeren, editors. 1987. Data Analysis in Community and Landscape Ecology. Pudoc, Wageningen, The Netherlands.

Analysis of Ecological Communities. Chapman and Hall, London.

Web Links

The Ordination Webpage: http://www.okstate.edu/artsci/botany/ordinate/

Note: this web site comes highly recommended, as it provides detailed yet simple explanations of most currently used ordination techniques (see the "Indirect Gradient Analysis" section of the site). In the "General Reference" section, Palmer offers a fantastic glossary of terms used in ordination and clarifies some common confusion in the terminology used to date. He also provides links to other ordination sites and to software. In the "Statistics and Background" section, read through "Centroids and Inertia", "Similarity, Distance and Difference", and "Explorations in Coenspace" for the conceptual background needed to understand ordination techniques. The "Direct Gradient Analysis" section will be of interest if you have collected specific environmental data in addition to abundance and species data; you may find this to be a stronger approach to the analysis of your data set. "Ecological Data, Transformations and Standardization" is for more advanced users who already understand ordination and want further information on data manipulation.

Principal Components Analysis (PCA)

PCA was one of the earliest ordination techniques applied to ecological data. PCA uses a rigid rotation to derive orthogonal axes, each of which captures the maximum remaining variance in the data set. Both species and sample ordinations result from a single analysis.
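The rotation can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the procedure of any particular ordination package. It assumes a covariance-based PCA, treats samples as rows and species as columns, and uses only four species (ABMA, ABCO, PIPO, PSME) taken from the example table; all variable names are illustrative.

```python
import numpy as np

# Rows = samples E7000 ... E1800; columns = species ABMA, ABCO, PIPO, PSME
# (a four-species subset of the example table, chosen only to keep this short).
X = np.array([
    [88.6, 211.4, 0.0, 0.0],
    [144.4, 149.3, 0.0, 0.0],
    [21.7, 243.2, 0.8, 0.0],
    [52.2, 190.5, 3.5, 17.1],
    [0.0, 102.4, 10.0, 105.5],
    [0.0, 12.4, 85.1, 48.8],
    [0.0, 18.7, 23.5, 125.0],
    [0.0, 0.0, 64.4, 5.5],
])

# Center each species, then take the eigenvectors of the covariance matrix.
centered = X - X.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns ascending eigenvalues; reorder so axis 1 has the most variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# One analysis yields both ordinations.
sample_scores = centered @ eigenvectors   # positions of samples on the new axes
species_loadings = eigenvectors           # contribution of each species to each axis

# Variance along each new axis, and the share of total variance it carries.
print(np.round(eigenvalues, 1))
print(np.round(eigenvalues / eigenvalues.sum(), 3))
```

Because the eigenvector matrix is orthogonal, distances between samples are unchanged by the rotation; only the axes along which they are described differ.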
Computationally, PCA is an eigenanalysis of the covariance (or correlation) matrix of the variables. The sum of the eigenvalues equals the sum of the variances of all variables in the data set. PCA is relatively objective and provides a reasonable but crude indication of relationships.

Reciprocal Averaging (RA)

RA is an ordination technique related conceptually to weighted averages.
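The weighted-averaging idea can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming the usual two-way iteration: sample scores are abundance-weighted averages of species scores, species scores are abundance-weighted averages of sample scores, and the scores are re-standardized after each pass. The function names and the four-species subset of the example table (ABMA, ABCO, PIPO, PSME) are choices made for this sketch only.

```python
import numpy as np

# Rows = species ABMA, ABCO, PIPO, PSME; columns = samples E7000 ... E1800.
abundance = np.array([
    [88.6, 144.4, 21.7, 52.2, 0.0, 0.0, 0.0, 0.0],
    [211.4, 149.3, 243.2, 190.5, 102.4, 12.4, 18.7, 0.0],
    [0.0, 0.0, 0.8, 3.5, 10.0, 85.1, 23.5, 64.4],
    [0.0, 0.0, 0.0, 17.1, 105.5, 48.8, 125.0, 5.5],
])


def standardize(scores, weights):
    """Give scores a weighted mean of 0 and a weighted variance of 1."""
    centered = scores - np.average(scores, weights=weights)
    return centered / np.sqrt(np.average(centered ** 2, weights=weights))


def reciprocal_averaging(abundance, max_iter=10000, tol=1e-12):
    """First RA axis by repeated weighted averaging (hypothetical helper)."""
    species_totals = abundance.sum(axis=1)
    sample_totals = abundance.sum(axis=0)

    # Arbitrary starting scores for the species.
    start = np.arange(abundance.shape[0], dtype=float)
    species_scores = standardize(start, species_totals)

    for _ in range(max_iter):
        # Sample score = weighted average of the scores of the species it holds.
        sample_scores = abundance.T @ species_scores / sample_totals
        # Species score = weighted average of the scores of the samples it occurs in.
        averaged = abundance @ sample_scores / species_totals
        averaged = averaged - np.average(averaged, weights=species_totals)
        # The shrinkage of the score range in one pass estimates the eigenvalue.
        eigenvalue = np.sqrt(np.average(averaged ** 2, weights=species_totals))
        new_scores = averaged / eigenvalue
        converged = np.max(np.abs(new_scores - species_scores)) < tol
        species_scores = new_scores
        if converged:
            break

    sample_scores = abundance.T @ species_scores / sample_totals
    return species_scores, sample_scores, eigenvalue


species_scores, sample_scores, eigenvalue = reciprocal_averaging(abundance)
print(np.round(species_scores, 3))
print(np.round(sample_scores, 3))
print(round(float(eigenvalue), 3))
```

Species and samples end up with scores on the same axis, which is why the two can be plotted together.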

However, computationally, RA is related to eigenvector ordinations. RA places sampling units and species on the same gradients, and maximizes the correlation between species scores and sample scores. It serves as a relatively objective analysis of community data. Results are generally superior to the results from PCA. However, the ends of RA axes are compressed relative to the middle, and the second axis is often a distortion of the first axis, resulting in an arch effect.

Detrended Correspondence Analysis (DCA)

DCA is an eigenvector ordination technique based on Reciprocal Averaging that corrects for the arch effect produced by RA. Hill and Gauch (1980) report that DCA results are superior to those of RA. Other ecologists criticize the detrending process of DCA. DCA is widely used for the analysis of community data along gradients. It has also been found effective for niche ordination of birds by foraging heights (Sabo, 1980). DCA ordinates samples and species simultaneously. It is not appropriate for the analysis of a matrix of similarity values computed from community data (Gauch, 1982).

Nonmetric Multidimensional Scaling (NMS)

NMS actually refers to an entire family of related ordination techniques. These techniques use rank-order information to identify similarity in a data set. NMS is a truly nonparametric ordination method that seeks the best reduced-space portrayal of relationships. The jury is still out on this type of ordination. Gauch (1982) claims NMS is not worth the extra computational effort, and that it gives effective results only for easy data sets with low diversity. Others hold that NMS is extremely effective (Kenkel and Orloci, 1986; Bradfield and Kenkel, 1987).

Appropriate uses of ordination:

It is important to keep in mind that the purpose of ordination is to help a researcher find pattern in data sets that are otherwise too complicated to interpret.
A good ordination technique will be able to identify the most important dimensions in a data set, and ignore the "noise", in order to show these patterns. However, ordination techniques should not be used in hypothesis-driven analysis. They are meant as exploratory tools. Thus, post-hoc analysis is acceptable, and many different techniques can be tried on the same data set. No null hypothesis can be rejected, nor are p-values generated to test statistical significance. When p-values are offered, they can only be used as a rough guide or indicator of underlying processes that MAY BE explaining community patterns.

Bibliography

Alisauskas, R. T. 1998. Winter range expansion and relationships between landscape and morphometrics of midcontinent Lesser Snow Geese. Auk 115:851-862.

Bradfield, G. E., and N. C. Kenkel. 1987. Nonlinear ordination using flexible shortest path adjustment of ecological distances. Ecology 68:750-753.

Gauch, H. G., Jr. 1982. Multivariate Analysis in Community Ecology. Cambridge University Press, Cambridge.

Hill, M. O., and H. G. Gauch. 1980. Detrended correspondence analysis: an improved ordination technique. Vegetatio 42:47-58.

Kenkel, N. C., and L. Orloci. 1986. Applying metric and nonmetric multidimensional scaling to ecological studies: some new results. Ecology 67:919-928.

Pielou, E. C. 1984. The Interpretation of Ecological Data: A Primer on Classification and Ordination. Wiley, New York.

Sabo, S. R. 1980. Niche and habitat relations in subalpine bird communities of the White Mountains of New Hampshire. Ecological Monographs 50:241-259.

Schluter, D., and P. R. Grant. 1982. The distribution of Geospiza difficilis on Galapagos islands: test of three hypotheses. Evolution 36:1213-1226.