Effects of Sample Distribution along Gradients on Eigenvector Ordination
Author(s): C. L. Mohler
Source: Vegetatio, Vol. 45, No. 3 (Jul. 31, 1981), pp. 141-145
Published by: Springer
Stable URL: http://www.jstor.org/stable/20037040

Effects of sample distribution along gradients on eigenvector ordination

C. L. Mohler*
Section of Ecology and Systematics, Cornell University, Ithaca, NY 14850, USA

Keywords: Correspondence analysis, Detrended correspondence analysis, Eigenvector ordination, Gradient analysis, Principal components analysis, Sample distribution, Stratified sampling

* I thank Mark V. Wilson, Peter L. Marks, Hugh G. Gauch, the late R. H. Whittaker, E. van der Maarel, and several anonymous reviewers for helpful comments on the manuscript, and Monica Howland for preparing the figures. This work was supported by McIntire-Stennis Grant No. 183-7551 and a grant from the National Park Service, both to Peter L. Marks of the Section of Ecology and Systematics at Cornell University.

Vegetatio 45, 141-145 (1981). 0042-3106/81/0453-0141 $1.00. © Dr. W. Junk Publishers, The Hague. Printed in The Netherlands.

Abstract

In general, disproportionately heavy sampling of the ends of a gradient increases the interpretability of eigenvector ordinations. More specifically, correspondence analysis (CA) and detrended correspondence analysis (DCA) best reproduce the original positions of samples in simulated coenoclines when samples are clustered toward the ends of the axis. Principal components analysis (PCA) reproduces the original sample positions less well than either CA or DCA and shows no improvement as samples are increasingly clustered toward the ends of the axis. PCA and CA show less curvature of one dimensional data into the second axis when sampling favors the ends of the axis.

Introduction

Ordination is often used to discover and elucidate major axes of compositional variation in vegetation data. In general, however, most phytosociologists have at least a rough idea of the variety of vegetation in an area even before formal sampling. When this is the case it is possible to stratify sampling with respect to the dominant axis (axes) of variation. This paper explores some aspects of pattern of sample stratification on performance of ordination techniques. The somewhat complementary problem of determining compositional distance between samples on an axis will be dealt with elsewhere. This study was prompted by the discovery that disproportionately heavy sampling of the ends of a gradient allows greater accuracy in plotting the response of species abundance against environmental variables and ordination axes.

There are a great variety of ordination techniques and the effect of sample arrangement on ordination will depend on the technique used. First, as Gauch et al. (1977) indicate, sample dispersion can have no effect on Bray-Curtis polar ordination (Bray & Curtis, 1957) because the technique arranges samples only with respect to selected end points. Second, in weighted average ordination (Ellenberg, 1948; Whittaker, 1956) sample positions are assigned only on the basis of species weights (and vice versa) so that clustering the samples can have no effect on their relative positions in the ordination. Third, Gaussian ordination (Gauch et al., 1974; Ihm & van Groenewoud, 1975) should be improved by heavy sampling toward the ends of the gradient since such a dispersion of samples gives maximum information about the more poorly defined curves.

The most common and generally useful techniques, however, are principal components analysis (PCA) and correspondence analysis (CA, also called reciprocal averaging). Gauch et al. (1977) find some changes in both PCA and CA when clusters of samples are added to a coenocline sampled at regular intervals. Although their discussion is brief, apparently CA is only slightly affected whereas PCA is strongly and unpredictably affected.

PCA and CA ordinations of coenoclines generally show 'arch distortion' (Gauch et al., 1977) in which samples (or species) lying originally on a single axis are displaced into higher axes. Although this is considered an inconvenience in arranging data it is a logical consequence of the nature of the technique: samples (species) with low first axis loadings necessarily differ from those with high loadings and the two sorts therefore fall at opposite ends of some higher axis. Accordingly, it seems reasonable to expect that bunching samples at the end should raise the percent of variance accounted for by the first axis and reduce the variance accounted for by the second and higher axes, which is to say, reduce the curvature into higher axes.

Recently Hill & Gauch (1980) have introduced detrended correspondence analysis (DCA) in which curvature of the primary axes into higher axes is systematically eliminated and positions are adjusted to remove CA's tendency to bunch species and samples near the ends of axes.

Two hypotheses based on the above considerations were tested: (1) that deviation from regular sampling of a coenocline causes eigenvector ordinations to distort first axis sample arrangements and (2) that concentration of samples near the ends of the gradient decreases curvature into the second axis (except in the case of DCA where such curvature is secondarily removed).

Methods

I first created three replicates of a 19 species, 21 stand coenocline using the simulation algorithm of Gauch & Whittaker (1972) modified so that the expected standard deviation of a species' abundance at each point on the gradient is proportional to its mean abundance at that point. This modification produces data with a structure similar to that from a variety of gradient analysis studies (Mohler, 1979 and unpublished data). The coenocline measured 2.9 half changes using the technique of Wilson & Mohler (in press). I concatenated the three data sets in various ways to produce 11 coenoclines sampled according to the eight patterns diagrammed in Figure 1. (Some coenoclines replicated certain sampling patterns.) Each data set was then ordinated using CA, DCA, species centered PCA, and species centered PCA with standardization.

Fig. 1. Sample dispersion patterns for ordination of simulated data. Coenocline index values are computed by the formula $I_d = \frac{2}{n}\sum_{j=1}^{n} |X_j - \bar{X}|$, where $X_j$ is the axial position of the jth sample, $\bar{X}$ is the mean of $X$, and $n$ is the number of samples. In all cases considered here n = 21 and $\bar{X}$ = 50, except the last where $\bar{X}$ = 48.33. The index runs from 0 (all samples at $\bar{X}$) to 100 (half of the samples at 0 and half at 100).

To compare the dispersion of samples on the original axis with the dispersion on the ordination axes I computed the index

$$ I_d = \frac{2}{n} \sum_{j=1}^{n} \left| X_j - \bar{X} \right| \qquad (1) $$

where $X_j$ is the axial position of sample j, $\bar{X}$ is the mean of $X_j$, and $n$ is the number of samples. The index runs from 0 (all samples at $\bar{X}$) to 100 (half of the samples at 0 and half at 100). Thus, for example, if I_d is smaller for an ordination than for the coenocline, the ordination is tending to shift samples toward the center of moment of the axis.

For each sampling pattern I also computed the mean displacement of samples on ordination axes from their position on the original coenocline (Kessell & Whittaker, 1976). Mean sample displacement is a measure of the overall lack of congruence between the two sample arrangements whereas comparison of I_d values indicates whether a lack of congruence is due to systematic shift of samples with respect to the center of moment of the axis.

Table 1. Dispersion pattern of coenoclines and ordinations, and mean displacement of samples on each ordination from their original positions on the corresponding coenocline. All coenocline and ordination axes were scaled from 0 to 100. Dispersion pattern is quantified by the index I_d (see text). Note that some dispersion patterns are replicated. The PCA ordinations were species centered with standardization.

| Sampling pattern | I_d, coenocline axis | I_d, DCA axis 1 | I_d, CA axis 1 | I_d, PCA axis 1 | Mean displacement, DCA | Mean displacement, CA | Mean displacement, PCA |
|---|---|---|---|---|---|---|---|
| Center favored | 23.8 | 19.9 | 14.7 | 14.0 | 5.0 | 9.7 | 14.9 |
| Center favored | 30.5 | 28.2 | 25.0 | 18.3 | 3.6 | 12.8 | 7.9 |
| Center favored | 40.0 | 35.5 | 34.7 | 22.3 | 4.1 | 8.9 | 9.2 |
| Regular spacing | | 47.5 | 51.4 | 30.7 | 3.4 | 6.3 | 11.0 |
| Regular spacing | | 45.1 | 48.0 | 35.5 | 5.6 | 7.7 | 9.3 |
| Regular spacing | | 47.7 | 51.2 | 31.6 | 2.8 | 5.7 | 10.6 |
| Ends favored | 68.6 | 62.0 | 68.2 | 46.4 | 4.8 | 4.8 | 11.7 |
| Ends favored | 68.6 | 63.6 | 69.1 | 47.1 | 3.6 | 4.7 | 11.6 |
| Ends favored | 77.1 | 73.2 | 79.1 | 58.6 | 2.6 | 3.8 | 10.2 |
| Ends favored | 80.0 | 76.0 | 81.6 | 61.9 | 2.7 | 3.6 | 10.0 |
| Ends favored | 87.0 | 87.0 | 89.7 | 67.0 | 1.8 | 6.3 | 12.9 |

Results

For CA and DCA the mean displacement of samples from their true position generally decreases as the samples become more clustered toward the ends (Table 1). The only notable exception to this trend is the slightly larger mean displacement for the most heavily end-weighted coenocline when using CA. Correlations between coenoclines and ordinations show almost exactly the same pattern. Placement of stands is consistently more accurate with DCA than with CA regardless of sampling scheme. In contrast to the two types of correspondence analysis, the mean displacement for standardized PCA does not improve when samples are clustered toward the ends (Table 1). On the other hand, correlations between coenocline and PCA axis positions do show a general improvement when samples are clustered toward the ends.
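
The two summary statistics reported in Table 1 can be sketched in Python as follows. This is a minimal illustration, not the author's program. It assumes the dispersion index is twice the mean absolute deviation of sample positions, a form consistent with the stated limits of 0 and 100, and it assumes mean sample displacement is the mean absolute difference between gradient positions and ordination scores rescaled to 0-100. The function names are hypothetical.

```python
import numpy as np

def rescale_0_100(scores):
    """Linearly rescale ordination scores so they run from 0 to 100."""
    scores = np.asarray(scores, dtype=float)
    return 100.0 * (scores - scores.min()) / (scores.max() - scores.min())

def dispersion_index(positions):
    """Assumed form of I_d: twice the mean absolute deviation from the mean.

    Gives 0 when all samples sit at the mean and 100 when half the
    samples are at 0 and half at 100.
    """
    positions = np.asarray(positions, dtype=float)
    return 2.0 * np.mean(np.abs(positions - positions.mean()))

def mean_displacement(gradient_positions, ordination_scores):
    """Assumed form: mean absolute difference on a common 0-100 scale.

    The sign of an eigenvector axis is arbitrary, so both orientations
    of the ordination axis are tried and the smaller value is returned.
    """
    a = np.asarray(gradient_positions, dtype=float)
    b = rescale_0_100(ordination_scores)
    return min(np.mean(np.abs(a - b)), np.mean(np.abs(a - (100.0 - b))))

# 21 regularly spaced samples on a 0-100 gradient.
regular = np.linspace(0.0, 100.0, 21)
print(dispersion_index(regular))  # about 52.4 under the assumed formula
```

Comparing `dispersion_index` for the gradient positions with the same index for the rescaled first-axis scores shows whether an ordination pulls samples toward the center of the axis, which is how the I_d columns of Table 1 are read.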
In general, misplacement of samples is worse for species centered PCA with standardization than for either CA or DCA. PCA without standardization shows involution in which the true end samples are displaced toward the center and samples at intermediate positions are shifted outward. Although involution with unstandardized PCA occurs with all sampling schemes tested, clustering samples toward the ends decreases this sort of distortion. Thus, as samples go from highly bunched toward the center to highly concentrated near the ends, separation of coenocline end samples on the ordination axis increases from 30 units to 85 units.

DCA reproduces the original dispersion pattern fairly well for all data sets tested (Table 1). Bunching samples toward the center of the gradient tends to produce CA ordinations in which the samples are even more centrally clustered than on the original axis, as indicated by I_d values which are lower for the ordinations than for the original gradient (Table 1). On the other hand, I_d values for CA are close to those for the original gradient for all sampling patterns which favor the gradient extremes (Table 1). PCA does not perform so well. Relative to the original dispersion, species centered PCA with standardization always tends to bunch samples toward the mean position (Table 1). Since nonstandardized PCA does not preserve sample sequence there is little point in computing I_d values for this technique.

With CA and the two PCA techniques, degree of curvature into the second axis is less when sampling favors the ends. Percent of variance accounted for by the first CA axis rises from 44% for the most centrally clustered pattern to 75% for the pattern which most favors the extremes (Fig. 2). In a complementary fashion, percentage of variance accounted for by the second axis falls from 39% to 9% (Fig. 2). Both PCA procedures perform in a similar manner.

Fig. 2. Variance accounted for by an ordination versus sample dispersion. Ordinate is variance accounted for (% EV) by first and second correspondence analysis axes; abscissa is the sample dispersion index for the coenocline (see text). The upper curve plots % EV for the first axis; the lower plots % EV for the second axis. Trend lines are fitted by eye. % EV values for the noiseless coenocline are represented by X.

Discussion

Eigenvector ordination seems to work best when sampling pattern favors the gradient extremes. CA and DCA reproduce the original sample positions more exactly when ends of the gradient are sampled more heavily than the center. Although DCA adjusts sample positions to correct for systematic displacement this does not necessarily correct for misplacement due to random variation in the data. Concentration of samples near the ends of the gradient through stratified sampling does seem to correct for some additional component of error, but the actual mechanism involved is unclear.

As in previous evaluations (Chardy et al., 1976; Gauch et al., 1977), PCA proved less able to recover the original sample positions than CA. Recovery of original positions by PCA does not improve when samples are clustered toward the extremes; nevertheless, heavy sampling of the ends results in less involution and less curvature into the second axis and thus makes PCA ordinations more interpretable than with regular or centrally clustered sampling.
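
The relationship between sampling pattern and the percent of variance carried by the first two CA axes can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a reproduction of the study. It uses noiseless Gaussian species curves rather than the modified Gauch & Whittaker (1972) algorithm, and the species tolerance, peak abundance, and `power` clustering scheme are hypothetical choices. CA is computed in the standard way, from the singular value decomposition of the standardized residuals of the abundance table.

```python
import numpy as np

def gaussian_coenocline(positions, n_species=19, tolerance=15.0, peak=100.0):
    """Noiseless abundances: species optima evenly spaced on a 0-100 gradient.

    Returns a samples-by-species table. All parameter values are
    hypothetical and are not taken from the paper.
    """
    positions = np.asarray(positions, dtype=float)
    optima = np.linspace(0.0, 100.0, n_species)
    z = (positions[:, None] - optima[None, :]) / tolerance
    return peak * np.exp(-0.5 * z ** 2)

def ca_percent_variance(table):
    """Percent of total inertia on each correspondence analysis axis."""
    p = table / table.sum()
    row_mass = p.sum(axis=1)
    col_mass = p.sum(axis=0)
    expected = np.outer(row_mass, col_mass)
    residuals = (p - expected) / np.sqrt(expected)
    singular_values = np.linalg.svd(residuals, compute_uv=False)
    eigenvalues = singular_values ** 2
    return 100.0 * eigenvalues / eigenvalues.sum()

def clustered_positions(n=21, power=1.0):
    """Sample positions on 0-100: power > 1 favors the center,
    power < 1 favors the ends, power = 1 gives regular spacing."""
    u = np.linspace(-1.0, 1.0, n)
    return 50.0 + 50.0 * np.sign(u) * np.abs(u) ** power

patterns = [("center favored", 3.0), ("regular", 1.0), ("ends favored", 1.0 / 3.0)]
for label, power in patterns:
    x = clustered_positions(power=power)
    pct = ca_percent_variance(gaussian_coenocline(x))
    print(f"{label:15s} axis 1: {pct[0]:5.1f}%  axis 2: {pct[1]:5.1f}%")
```

Printing the first two percentages for each pattern gives a rough analogue of Fig. 2. The numbers depend on the assumed gradient length and on the absence of noise, so they should not be expected to match the published values.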
Although I made no attempt to evaluate the effect of sampling pattern on ordinations of data with two or more major axes of variation, results of this study can be generalized to more complex data sets. First, from the previous discussion it seems reasonable to expect that whenever a factor is known to affect community composition, stratification which favors the extremes of that factor will improve stand placement on CA or DCA axes which correlate highly with that factor. Increased accuracy in sample placement due to multiple axis stratification is likely to be greatest with DCA since the various axes will then lack the complicating distortions found in other eigenvector techniques. Second, with both CA and PCA, curvature of the first axis into higher axes complicates interpretation even when there are several major dimensions of variability in the data. Sample stratification which favors extremes of the first dimension should reduce this curvature. Furthermore, stratification of samples with respect to the second dimension of variation should reduce curvature of the second ordination axis into the third and so forth.

Since stratification which favors the extremes loads additional variability into particular dimensions it can influence which dimension emerges as the dominant axis. A change in the ranking of axes will be of little importance in most applications, however, since relative placement of samples within the ordination field will be largely independent of which axis appears with the largest eigenvalue.

The consequences of this study for field workers are clear: in order to visualize variability in vegetation using eigenvector ordination one must capture variability in the sample. Accordingly, extreme and unusual environments should be strongly favored during sampling. In contrast, since intermediate environments tend to be more common than extreme ones within most landscapes, a haphazard or simple random sample is likely to result in ordinations of low interpretability. I demonstrate elsewhere that stratified sampling which favors extreme communities also greatly improves estimates of species distribution parameters.

References

Bray, J. R. & Curtis, J. T., 1957. An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27: 325-349.

Chardy, P., Glemarec, M. & Laurec, A., 1976. Application of inertia methods to benthic marine ecology: practical implications of the basic options. Estuarine Coastal Mar. Sci. 4: 179-205.

Ellenberg, H., 1948. Unkrautgesellschaften als Mass für den Säuregrad, die Verdichtung und andere Eigenschaften des Ackerbodens. Ber. Landtech. 4: 130-146.

Gauch, H. G. Jr. & Whittaker, R. H., 1972. Coenocline simulation. Ecology 53: 446-451.

Gauch, H. G. Jr., Chase, G. B. & Whittaker, R. H., 1974. Ordination of vegetation samples by Gaussian species distributions. Ecology 55: 1382-1390.

Gauch, H. G. Jr., Whittaker, R. H. & Wentworth, T. R., 1977. A comparative study of reciprocal averaging and other ordination techniques. J. Ecol. 65: 157-174.

Hill, M. O. & Gauch, H. G. Jr., 1980. Detrended correspondence analysis, an improved ordination technique. Vegetatio 42: 47-59.

Ihm, P. & Groenewoud, H. van, 1975. A multivariate ordering of vegetation data based on Gaussian type gradient response curves. J. Ecol. 63: 767-777.

Kessell, S. R. & Whittaker, R. H., 1976. Comparisons of three ordination techniques. Vegetatio 32: 21-29.

Mohler, C. L., 1979. An analysis of floodplain vegetation of the lower Neches drainage, southeast Texas, with some considerations on the use of regression and correlation in plant synecology. Ph.D. Thesis, Cornell Univ., Ithaca, N.Y.

Whittaker, R. H., 1956. Vegetation of the Great Smoky Mountains. Ecol. Monogr. 26: 1-80.

Wilson, M. V. & Mohler, C. L., In press. GRADBETA - a FORTRAN program for measuring compositional change along gradients. Ecology & Systematics, Cornell Univ., Ithaca, N.Y. 51 p.

Accepted 5.1.1981.