Linking species-compositional dissimilarities and environmental data for biodiversity assessment

Similar documents
Phylogenetic diversity and conservation

Spatial non-stationarity, anisotropy and scale: The interactive visualisation of spatial turnover

Multivariate Statistics 101. Ordination (PCA, NMDS, CA) Cluster Analysis (UPGMA, Ward s) Canonical Correspondence Analysis

Multivariate Analysis of Ecological Data

An Introduction to Ordination Connie Clark

Compositional dissimilarity as a robust measure of ecological distance

Experimental Design and Data Analysis for Biologists

Multivariate Analysis of Ecological Data using CANOCO

Characterizing and predicting cyanobacterial blooms in an 8-year

A diversity of beta diversities: straightening up a concept gone awry. Part 2. Quantifying beta diversity and related phenomena

-Principal components analysis is by far the oldest multivariate technique, dating back to the early 1900's; ecologists have used PCA since the

A Comparison of Rainfall Estimation Techniques

Some thoughts on the design of (dis)similarity measures

ANOVA approach. Investigates interaction terms. Disadvantages: Requires careful sampling design with replication

The biogenesis-atbc2012 Training workshop "Evolutionary Approaches to Biodiversity Science" June 2012, Bonito, Brazil

Non-independence in Statistical Tests for Discrete Cross-species Data

Inconsistencies between theory and methodology: a recurrent problem in ordination studies.

Contextual analysis of ecological change for Tasmania

The Atlas Aspect of the Atlas of Living Australia

Description of Samples and Populations

What is the range of a taxon? A scaling problem at three levels: Spa9al scale Phylogene9c depth Time

Supplementary Materials for

DETECTING BIOLOGICAL AND ENVIRONMENTAL CHANGES: DESIGN AND ANALYSIS OF MONITORING AND EXPERIMENTS (University of Bologna, 3-14 March 2008)

Measured wind turbine loads and their dependence on inflow parameters

Lecture 2: Diversity, Distances, adonis. Lecture 2: Diversity, Distances, adonis. Alpha- Diversity. Alpha diversity definition(s)

BIOL 580 Analysis of Ecological Communities

15 March 2010 Re: Draft Native Vegetation of the Sydney Metropolitan Catchment Management Authority Area GIS layers and explanatory reports

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp

Multivariate Analysis of Ecological Data

Worksheet 4 - Multiple and nonlinear regression models

Vellend, Mark. Do commonly used indices of β-diversity measure species turnover?

Empirical Research Methods

2/7/2018. Strata. Strata

A Short Course in Basic Statistics

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

INTRODUCTION TO MULTIVARIATE ANALYSIS OF ECOLOGICAL DATA

BIO 682 Multivariate Statistics Spring 2008

BIOL 580 Analysis of Ecological Communities

Ridge Regression. Summary. Sample StatFolio: ridge reg.sgp. STATGRAPHICS Rev. 10/1/2014

Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010

Glossary for the Triola Statistics Series

Introduction to multivariate analysis Outline

Dan F. Rosauer, Simon Ferrier, Kristen J. Williams, Glenn Manion, J. Scott Keogh and Shawn W. Laffan

SUPPLEMENTARY INFORMATION

Introduction to Statistical Analysis using IBM SPSS Statistics (v24)

Meelis Kull Autumn Meelis Kull - Autumn MTAT Data Mining - Lecture 03

Inferring Ecological Processes from Taxonomic, Phylogenetic and Functional Trait b-diversity

From seafloor geomorphology to predictive habitat mapping: progress in applications of biophysical data to ocean management.

Module 4: Community structure and assembly

Principal Components Analysis. Sargur Srihari University at Buffalo

STAT 730 Chapter 14: Multidimensional scaling

Sociology 6Z03 Review I

LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION

Math, Stats, and Mathstats Review ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

SUPPLEMENTARY INFORMATION

Chapter 1 Ordination Methods and the Evaluation of Ediacaran Communities

Variety is the spice of river life: recognizing hydraulic diversity as a tool for managing flows in regulated rivers

4/4/2018. Stepwise model fitting. CCA with first three variables only Call: cca(formula = community ~ env1 + env2 + env3, data = envdata)

Diversity partitioning without statistical independence of alpha and beta

Linear Programming: Simplex

y response variable x 1, x 2,, x k -- a set of explanatory variables

Multidimensional scaling (MDS)

Do not copy, post, or distribute

Spatial Effects on Current and Future Climate of Ipomopsis aggregata Populations in Colorado Patterns of Precipitation and Maximum Temperature

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

1. Exploratory Data Analysis

Depth-Duration Frequency (DDF) and Depth-Area- Reduction Factors (DARF)

Subject CS1 Actuarial Statistics 1 Core Principles

HOMEWORK ANALYSIS #2 - STOPPING DISTANCE

Representing species in reserves from patterns of assemblage diversity

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Atmospheric Circulation and the Global Climate System A map-based exploration

Distance-based multivariate analyses confound location and dispersion effects

COASTAL QUATERNARY GEOLOGY MAPPING FOR NSW: EXAMPLES AND APPLICATIONS

Managing Uncertainty in Habitat Suitability Models. Jim Graham and Jake Nelson Oregon State University

2002 HSC Notes from the Marking Centre Geography

Biometrics Unit and Surveys. North Metro Area Office C West Broadway Forest Lake, Minnesota (651)

Phylogenetic beta diversity: linking ecological and evolutionary processes across space in time

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

Analysis of Fast Input Selection: Application in Time Series Prediction

appstats27.notebook April 06, 2017

Geometric Algorithms in GIS

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

The discussion Analyzing beta diversity contains the following papers:

The Environmental Classification of Europe, a new tool for European landscape ecologists

Part III: Unstructured Data. Lecture timetable. Analysis of data. Data Retrieval: III.1 Unstructured data and data retrieval

Multiple Comparison Procedures Cohen Chapter 13. For EDUC/PSY 6600

Contents. Acknowledgments. xix

Natural Resource Management. Northern Tasmania. Strategy. Appendix 2

Using Predictive Modelling to Aid Planning in the Mineral Exploration and Mining Sector A Case Study using the Powelliphanta Land Snail

Remote sensing of biodiversity: measuring ecological complexity from space

Sparse regularization of NIR spectra using implicit spiking and derivative spectroscopy

CAP. Canonical Analysis of Principal coordinates. A computer program by Marti J. Anderson. Department of Statistics University of Auckland (2002)

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Classification of Erosion Susceptibility

Multivariate Analysis of Ecological Data

Atmospheric Circulation and the Global Climate System A map-based exploration

Gradient types. Gradient Analysis. Gradient Gradient. Community Community. Gradients and landscape. Species responses

Mrs. Poyner/Mr. Page Chapter 3 page 1

Transcription:

Linking species-compositional dissimilarities and environmental data for biodiversity assessment D. P. Faith, S. Ferrier Australian Museum, 6 College St., Sydney, N.S.W. 2010, Australia; N.S.W. National Parks and Wildlife Service, PO Box 402 Armidale, NS.W. 2350, Australia. Abstract Effective biodiversity assessment makes best-possible use of available data to provide surrogate information for "all" of biodiversity and to allow comparisons among "all" areas. Environmental data is typically available for all areas but may be a weak biodiversity surrogate. Species records may be a good surrogate for other species, but typically are not available for all areas. These two forms of information can be combined by 1) summarising biotic information as compositional dissimilarities among areas, 2) linking observed dissimilarities to observed environmental differences, and then 3) using calculated environmental differences for new areas to assess biotic turnover. Approaches to this problem include ordination and dissimilarity/distance regression. Examples of both, using biodiversity data from Panama, highlight the advantages of using realistic models linking dissimilarity to environmental variation. While measured environmental variation explains Panama dissimilarities well, the ordination approach suggests that there are "missing" environmental variables that would increase explanation of biotic turnover among areas. Introduction Regional biodiversity assessment and planning depends on an understanding of patterns of species turnover from place to place. Quantifying such "beta-diversity" patterns provides information related to the number of species found in one area and not another, and so is fundamental to identification of representative sets of areas. One way to measure beta diversity is to calculate pairwise species-compositional dissimilarities among areas (for example, using the Bray Curtis measure; 1). Given that all species are never recorded for all areas, these dissimilarities typically are calculated only for selected taxa, and for a small subset of all areas. Even when the selected taxa are accepted as a surrogate for all species, limited distribution information for these taxa makes it impossible to directly compare all areas for biodiversity assessment. One alternative approach would be to use a form of surrogate information that typically is recorded for all areas - environmental data describing bioclimatic variables, lithology, or other factors. A better alternative, making best possible use of all the available data for biodiversity assessment, would combine available species and environmental information. Development of strategies for such a "best-possible" use of all available data remains a key challenge in biodiversity assessment (2, 3). Modeling species compositional dissimilarities (between areas) using environmental data may provide such a strategy, and may overcome the area-comparability problem. If calculated dissimilarities from available species data can be linked (predicted by) corresponding differences in environmental data, then all areas can be compared to one another in terms of predicted dissimilarities. Two possible pathways to this goal will be

compared and contrasted in this paper: i) ordinations (4, 5, 6), and ii) generalised dissimilarity modeling (GDM; 7, 8). i) Ordinations (configurations of areas in a multidimensional Euclidean space) provide a natural way to model beta diversity based on Bray Curtis or related dissimilarities (1). Under an assumption of general unimodal responses of species to underlying environmental gradients, a non-linear relationship between "environmental" distances from the ordination and species-based dissimilarities can be expected (1). Inference of ordination space from dissimilarities then can take advantage of this expected relationship (using Hybrid Multidimensional Scaling; 1, 2). The ordination space, based on the given set of dissimilarities, then can be linked to measured environmental variables by searching for projections through the space having maximum correlation with these variables. This regression approach allows new areas to be placed in the space based on their environmental values. Biodiversity assessment then may judge how well protected areas span the space, or how well a new area might fill a "gap" in the space (using the "environmental diversity" [ED] measure; 2). ii) Alternatively, regression may be applied directly to dissimilarities and environmental differences/distances among areas. A simple approach to directly modeling dissimilarities based on environmental distances is to use the values in a conventional linear regression. However, there are two limitations with this approach. One is that the expected non-linear relationship of dissimilarities and environmental distances is ignored. The other is that the rate of turnover in species composition is wrongly assumed constant along different parts of the same environmental gradient (9, 10, 11). Generalised dissimilarity modeling (GDM; 7, 8) generalises the linear dissimilarity/distance regression approach to accommodate both a nonlinear dissimilarity/distance relationship and a non-constant rate of turnover along gradients. This regression allows dissimilarities to be estimated for all pairs of areas. An ordination then can be calculated and representation levels and gaps assessed as in i). To explore properties of these two approaches, we will re-examine recently published biodiversity and environmental data from Panama. Duivenvoorden et al. (12) examined environmental and spatial explanations of beta diversity in Panama (see also Condit et al.; 13). Their study concluded that environmental data was a poor explanator of beta diversity, given that "59% of the variation in species dissimilarity remained unexplained by either distance or environment". Poor correspondence between environmental and beta diversity would limit the effectiveness of conservation strategies based on a predictive link between the two. However, it is possible that the poor explanation was simply a consequence of restricted assumptions of the analysis. Their approach assumed only a linear relationship between dissimilarity and distance, based on linear regression of dissimilarities in species composition against environmental and spatial distances between pairs of areas. We will consider realistic alternative models applied to these same data. We will first document the non-linear relationships between dissimilarity and distance for the Panama data, using ordination. Then we will apply GDM to explore the prediction of dissimilarity from environmental (and spatial) distance. Lastly, we will return to the ordination context to seek evidence for unmeasured environmental variables that might have further increased explanation. Ordination of Panama species data

The expected relationship between dissimilarity and distance depends on the relationship between species abundances and change in environmental variables (forming "gradients"). A standard model of unimodal species' responses to environmental gradients (4, 5, 6) implies a non-linear relationship between dissimilarity and distance (3). An ordination of areas can be inferred from Bray Curtis (or related) dissimilarities, based on a form of multidimensional scaling that matches this model. Hybrid Multidimensional Scaling (HMDS; 1) allows a linear relationship for small dissimilarities, and a monotonic relationship overall. The ordination method uses the species-based dissimilarities to derive a configuration of the areas, in a nominated number of dimensions, such that the dissimilarities have a strong, but non-linear, fit to distances among areas. For the Panama data (described in detail in 12 and http://www.sciencemag.org/cgi/content/full/295/5555/666/dc1, we used PATN software (14) to calculate an HMDS ordination. Ordination/environmental distances initially were calculated as Euclidean distances in a 2-dimensional space. The resulting ordination analysis illustrates the typical non-linear relationship between "environmental" distances from the ordination and species-based dissimilarities (Fig. 1). Fig. 1. The stress value (0.211) indicates degree of fit between dissimilarities and distances. The 2-dimensional configuration of areas is not shown, and is available from the authors on request. This ordination links with measured environmental values for areas. Correlations of ordination projections with these measured environmental variables describe the link from environmental variation to ordination pattern (Table 1).

Table 1. Variables (row) order is east-west index, north-south index, rainfall, altitude, age (for description of variables, see http://www.sciencemag.org/cgi/content/full/295/5555/666/dc1). The first two columns indicate the projection along each axis of the vector of maximum product-moment correlation for each variable (coefficients greater than 0.5 highlighted in red). Column three is the corresponding correlation value. We cannot calculate significance of correlation values in the usual way given the lack of independence of data points. We take these values as indicating relative degrees of correlation. These correlation values illustrate the capacity of ordinations to link dissimilarities and environmental differences among areas. In practice, these results might serve as the basis for a formal predictive regression, so that new areas can be placed in the space allowing assessment of all areas. While correlations show correspondence between ordination and environmental variation, we have not quantified the degree to which environment accounts for beta diversity when expressed as dissimilarities. To do this, we turn to GDMs. Generalised dissimilarity modeling Application of GDM to the Panama data reveals a strong, non-linear, fit of dissimilarities to environmental/spatial distances (Fig. 2). Fig. 2. The scatterplot depicts the relationship between environmental/spatial distance, as predicted by the model (a non-linear function of precipitation, elevation, age, geology, and geographical separation of sites), and observed species-based Bray-Curtis dissimilarity. The curved line represents the non-linear link function,, relating dissimilarity to environmental/spatial distance.

Duivenvoorden et al. reported a high value of 59% (12) of total variation left unexplained by space or environment. The non-linear fit from the GDM, by contrast, provides a much better explanation of beta diversity based on measured environmental/spatial variables. Only 15.6% of variation in Panamanian species dissimilarity now is unaccounted for (Fig. 3). Fig. 3. The pie-chart partitions the variance in observed dissimilarities explained by the GDM model. The "space" component is the variance explained by geographical separation after controlling for environment, while the "environment" component is the variance explained by environmental variables after controlling for geographical separation. The large "space and environment" component is the variance shared by environment and geographical separation, due to the correlation between these variables. The full model (incorporating both environment and geographical space) explains 84.4% of the variance in observed Bray-Curtis dissimilarity values (compared with the 41% achieved by Duivenvoorden et al.). Missing environmental variables These non-linear dissimilarity/distance models also may point to variation corresponding to one or more unmeasured environmental variables. Our ordination of the Panama species data, extended to six dimensions, implicates additional, unmeasured, environmental variation (Table 2). Table 2. The order of environmental/spatial variables (rows) is as in Table 1. The first six columns indicate orientation along each axis of the vector of maximum correlation for each variable (coefficients > 0.5 in red). Column seven is the corresponding product-moment correlation value. While projections along axes 1, 5 and 6 account for measured environmental/spatial variables, axes 2, 3 and 4 do not correspond well to these measured variables (the unordered

variable, geology, also showed no strong pattern along these axes). The 6-dimensional ordination therefore appears to have achieved a much-improved explanation of beta diversity (stress value lowered from 0.211 to 0.091) by revealing species' responses to other, unmeasured, environmental factors. A possible alternative explanation, however, is that any higher dimensional ordination might produce such a result - the stress value is always lower for a greater number of dimensions. We therefore evaluated the results for our 6-dimensional ordination in two ways. First, we examined the change in stress with additional dimensions starting from a 2-dimensional solution, and found no evidence of a sudden plateau effect (Fig. 4). Any sudden plateau effect - little reduction in stress beyond some number of dimensions - would suggest that 6 dimensions is not justified by these data. Fig. 4. Change in stress value (vertical axis) with number of dimensions (horizontal axis). Support for the 6-dimensional solution also might be argued from the consequent higher correlation values (relative to the 2-dimensional solution) for the measured environmental/spatial variables (Table 2). However, adding even random extra dimensions (say, to the initial 2-dimensional solution) might produce high correlations, because the vector of maximum correlation may take advantage of chance variation along these additional dimensions. To test this, we carried out a randomization test for one of the environmental variables (altitude). We fixed the 2-dimensional solution and repeatedly added four additional randomized dimensions, in each case re-calculating the maximum correlation vector/value for altitude. None of the new correlation values were as great as the observed correlation for 6 dimensions (Fig. 5).

Fig. 5. In the histogram, the actual stress value for 6 dimensions (black bar) is much lower than any of the stress values for the simulations (gray bars). We conclude that the 6-dimensional solution reveals patterns of beta-diversity in Panama that may be explained by additional, currrently-unmeasured, environmental variables. Discussion Species turnover in Panama is explained well by a combination of measured and unmeasured environmental variation. These results support the idea that realistic non-linear dissimilarity/distance models can account for beta diversity patterns. The results highlight the prospects for using available environmental data in combination with available species' survey data (e.g. from museum collections), in a way that overcomes the weaknesses of using either information on its own. However, more work is needed on linking these models to actual biodiversity planning. One candidate approach is the "ED" method (2) referred to in the introduction. ED uses dissimilarities among areas (often summarised as an ordination) to calculate complementarity values - gains and losses in represented biodiversity as sets of protected areas change (2). A form of ED has been used in N.S.W., Australia, as part of a framework for balancing biodiversity and other land use needs (15). However, in that study dissimilarities among all areas were obtained by relying only on the use of environmental data to derive an ordination space. Similarly, a variant of ED, using environmental data, has been used to select new survey sites in N.S.W. In this application, the expectation is that ED-based gaps indicate areas offering many new species. For both land-use planning and survey design, improvements in estimating biodiversity gains can be expected if the extensive Museum collection data for N.S.W. is linked to the environmental data, using the non-linear methods described here. Returning to the Panama context, biodiversity planning can now take advantage of the highly predictive relationship found in this study between species compositional dissimilarities and

environmental/spatial distances. A next step would be to calculate an ordination space based on estimated dissimilarities among all areas. Future planning analyses in Panama then may identify cost-effective conservation options filling gaps in ordination space. References 1. D. P. Faith, P. R. Minchin, L. Belbin, Vegetatio 69, 57 (1987). 2. D. P. Faith, P. A. Walker, Biod. Conserv. 5, 399 (1996). 3. D. P. Faith, P. A. Walker, Biod. Lett. 3, 18 (1996). 4. M. P. Austin, Vegetatio 69, 35 (1987). 5. H. G. Gauch, Jr., Multivariate Analysis and Community Structure. Cambridge University Press, Cambridge (1982). 6. R. H. Whittaker, Biol. Rev. 42, 207 (1967). 7. S. Ferrier, Syst. Biol. 51, 331 (2002). 8. S. Ferrier, M. Drielsma, G. Manion, G. Watson, Biod. Conserv. (in press). 9. M. T. Simmons, R. M. Cowling, Biod. Conserv. 5, 551 (1996). 10. M. V. Wilson, C. L. Mohler, Vegetatio 54, 129 (1983). 11. J. Oksanen, T. Tonteri, J. Veg. Sci. 6, 815 (1995). 12. J. F. Duivenvoorden, J. C. Svenning, S. J. Wright, Science 295, 636 (2002). 13. R. Condit et al., Science 295, (2002). 14. L. Belbin, PATN software package. CSIRO, Canberra (1994). 15. D. P. Faith, P. A. Walker, J. Ive, L. Belbin, For. Ecol. Manag. 85, 251 (1996).