Vegetation Analysis Gradient Analysis

Slide 18: Gradient Analysis
Relation of species and environmental variables or gradients.
Individualistic species responses.
[Diagram: Gradient and Community, linked by gradient analysis and bioindication.]

Slide 19: Gradient types
1. Direct gradients: Influence organisms but are not consumed. Correspond to conditions.
2. Resource gradients: Consumed. Correspond to resources.
Complex gradients: Covarying direct and/or resource gradients. Impossible to separate the effects of single gradients. Most observed gradients are complex.

Slide 20: Gradients and landscape
[Figure: sites A, B, C and D shown in the Mt Field landscape and in gradient space.]

Slide 21: Species responses
Species have non-linear responses along gradients.
Slide 22: Linear models are inadequate
The slope, sign and significance depend on the studied range on the gradient.
[Figure: linear fits over three ranges of the same gradient, with r = 0.836, r = -0.036 and r = -0.848.]

Slide 23: Gaussian response
$\mu = h \exp\left(-\frac{(x - u)^2}{2t^2}\right)$
Three interpretable parameters:
1. Location of the optimum u on gradient x.
2. Expected height h at the optimum.
3. Width t of the response.
[Figure: Gaussian response of BAUERUBI with u, h and t marked.]

Slide 24: Dream of species packing
Species have Gaussian responses and divide the gradient optimally:
Equal heights h.
Equal widths t.
Evenly distributed optima u.
Species packing is the theoretical basis of (canonical) correspondence analysis.
[Figure: evenly packed Gaussian responses along a gradient.]

Slide 25: Evidence for Gaussian response
Whittaker described many response types: multimodal, skewed, flat, plateaux and symmetric.
Only a small part of the responses were regarded as symmetric, yet the symmetric response became the standard.
First canonized in coenocline simulations.
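The Gaussian response of Slide 23 can be sketched in a few lines of Python. This is only an illustration: the function name and the parameter values are assumptions, not taken from the slides.

```python
import math

def gaussian_response(x, u, h, t):
    """Expected abundance at gradient value x, given optimum u, height h, width t."""
    return h * math.exp(-((x - u) ** 2) / (2.0 * t ** 2))

# Hypothetical parameter values, chosen only for illustration.
u, h, t = 4.0, 10.0, 1.0

at_optimum = gaussian_response(u, u, h, t)          # the response equals h at x = u
one_width_away = gaussian_response(u + t, u, h, t)  # h * exp(-1/2) at x = u + t
```

The response is symmetric about u, so `gaussian_response(u - t, u, h, t)` gives the same value as `one_width_away`.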
Slide 26: Weighted averages
Weights: Species abundances y. Gives u as the average on x.
$\tilde{u}_j = \frac{\sum_{i=1}^{N} y_{ij} x_i}{\sum_{i=1}^{N} y_{ij}}$
Presence/absence data: the average of site values where the species occurs.
Quantitative data: more weight to sites where the species is more abundant.
Symmetric: Species optima u can in turn be used to estimate gradient values x.

Slide 27: Bias and truncation
Weighted averages are good estimates of Gaussian optima, unless the response is truncated.
Bias towards the gradient centre: shrinking.
[Figure: response y along gradient x, with the sampled range and the weighted average (WA) marked.]

Slide 28: Popular response models
Gaussian response model: The most popular model. Gives symmetric responses and is the basis of much of the theory of ordination and gradient analysis.
Beta response: Able to produce responses of varying skewness and kurtosis, and challenges the Gaussian dominance.
HOF response: A family of hierarchic models which can produce skewed, symmetric or different monotone responses, and can be used to analyse the response shape.
GAM models: Can find and fit any kind of smooth response shape.

Slide 29: Shape matters
Fundamental response can be symmetric, but realized response skewed or multimodal due to species interactions.
[Figure: response against gradient.]
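The weighted average of Slide 26 and the truncation bias of Slide 27 can be illustrated with a small sketch. The abundances and gradient values below are made up for the example, and the function name is an assumption.

```python
def weighted_average(abundances, gradient):
    """Weighted average of gradient values, with species abundances as weights."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("the species must occur in at least one site")
    return sum(y * x for y, x in zip(abundances, gradient)) / total

# Hypothetical species with a symmetric response centred at x = 3.
gradient = [1.0, 2.0, 3.0, 4.0, 5.0]
abundances = [0, 1, 4, 1, 0]
wa_full = weighted_average(abundances, gradient)

# Same species, but only the upper part of its range was sampled:
# the weighted average moves towards the centre of the sampled gradient.
wa_truncated = weighted_average([4, 1, 0], [3.0, 4.0, 5.0])
```

With presence/absence data the abundances are simply 0 or 1, and the result is the plain mean of the site values where the species occurs.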
Slide 30: Real World (almost)
In Danish beech forests, dominant species skew other species away.
Austin predicted skewed responses at gradient ends.
A decent gradient would be nice instead of a DCA axis...
[Figure: fitted responses of tree and shrub species along DCA axis 1, in two panels (Trees, Shrubs).]

Slide 31: Gaussian response: a case of GLM
Can be reparametrized as a generalized linear model:
Gradient as a 2nd degree polynomial.
Logarithmic link function.
$\mu = h \exp\left(-\frac{(x - u)^2}{2t^2}\right)$
$\log(\mu) = b_0 + b_1 x + b_2 x^2$
$u = -\frac{b_1}{2 b_2}$
$t = \sqrt{-\frac{1}{2 b_2}}$
$h = \exp\left(b_0 - \frac{b_1^2}{4 b_2}\right)$
[Figure: Gaussian response of BAUERUBI along DCA 1 with u, h and t marked.]

Slide 32: Generalized linear models: a refresher
1. Linear predictor $\eta$: a linear function of explanatory variables, which can be continuous or classes, and can be transformed variables, powers or polynomials.
$\eta = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
2. Link function $g(\cdot)$ that transforms the fitted values $\mu$ to the linear predictor $\eta$.
$g(\mu) = \eta$
3. Error distribution from the exponential family to describe the distribution of residuals about fitted values.

Slide 33: Special cases of GLM
Model         Link                     Error     Variance
Linear model  Identity ($\mu = \eta$)  Normal    Constant
Log-linear    Logarithmic              Poisson   $\mu$
Logistic      Logistic                 Binomial  $\mu(1 - \pi)$
[Figure panels: "Identity, polynomial": 3 + 2*x - 6*x^2; "Log link, polynomial": exp(3 + 2*x - 6*x^2); "Logit, linear": plogis(-1 + 3*x); "Linear on log(x)": -1 + 5*log(x).]
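The reparametrization of Slide 31 follows from expanding the square in the exponent and matching powers of x. A minimal sketch of the conversion in both directions is given below; the function names are assumptions, and the coefficients would in practice come from a fitted log-link GLM.

```python
import math

def gaussian_from_glm(b0, b1, b2):
    """Convert log-link polynomial coefficients to Gaussian parameters (u, t, h)."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal (bell-shaped) response")
    u = -b1 / (2.0 * b2)
    t = math.sqrt(-1.0 / (2.0 * b2))
    h = math.exp(b0 - b1 ** 2 / (4.0 * b2))
    return u, t, h

def glm_from_gaussian(u, t, h):
    """Inverse conversion: Gaussian parameters to polynomial coefficients."""
    b2 = -1.0 / (2.0 * t ** 2)
    b1 = u / t ** 2
    b0 = math.log(h) - u ** 2 / (2.0 * t ** 2)
    return b0, b1, b2

# Round trip with hypothetical parameter values.
b0, b1, b2 = glm_from_gaussian(u=4.0, t=1.5, h=10.0)
u, t, h = gaussian_from_glm(b0, b1, b2)
```

A positive or zero estimate of b2 means that the fitted curve has no maximum, so no Gaussian parameters exist; the sketch raises an error in that case.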
Slide 34: Ecologically meaningful error distributions
Normal error is rarely adequate in ecology, but GLMs offer ecologically meaningful alternatives.
Poisson. Counts: integers, non-negative, variance increases with mean.
Binomial. Observed proportions from a total: integers, non-negative, have a maximum value, variance largest at $\pi = 0.5$.
Gamma. Concentrations: non-negative real values, standard deviation increases with mean, many near-zero values and some high peaks.

Slide 35: Goodness of fit and inference
Deviance: Measure of goodness of fit.
Derived from the error function: the residual sum of squares with Normal error.
Distributed approximately like $\chi^2$.
Residual degrees of freedom: Each fitted parameter consumes one degree of freedom and (probably) reduces the deviance.
Inference: Compare the change in deviance against the change in degrees of freedom.
Overdispersion: Deviance larger than expected under a strict likelihood model.
Use the F statistic in place of $\chi^2$.

Slide 36: Gaussian model and response range
Gaussian response is never exactly zero: Asymptotic model.
Observed abundances have a discrete component.
The observed range depends on parameters t and h.

Slide 37: Several gradients
Gaussian response can be fitted to several gradients, giving a bell-shaped response surface.
[Figure: Gaussian response over two gradients.]
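The deviance of Slide 35 can be made concrete with a minimal sketch for Poisson and Normal error. The formulas are the standard deviance definitions; the counts and fitted values below are hypothetical and do not come from a real model fit.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    deviance = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        deviance += 2.0 * (log_term - (yi - mi))
    return deviance

def normal_deviance(y, mu):
    """With Normal error the deviance is the residual sum of squares."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu))

counts = [0, 1, 4, 2, 0]                       # hypothetical observed counts
null_fit = [sum(counts) / len(counts)] * 5     # intercept-only model: the mean
better_fit = [0.2, 1.2, 3.5, 1.8, 0.3]         # hypothetical fitted values

deviance_null = poisson_deviance(counts, null_fit)
deviance_model = poisson_deviance(counts, better_fit)
change_in_deviance = deviance_null - deviance_model
```

Inference then compares `change_in_deviance` against the number of parameters added, using a chi-squared reference distribution, or an F statistic if the data are overdispersed.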
Slide 38: Interactions in Gaussian responses
No interactions: Responses parallel to the gradients.
Interactions: The optimum on one gradient depends on the other.
[Figure: Gaussian response surfaces over two gradients, without and with interaction.]

Slide 39: Logistic Gaussian response
The polynomial is often used with link functions other than log.
Binomial error: logistic link.
The Gaussian parameters are correct only with the log link: Width t has a different interpretation.
[Figure: probability of occurrence along the gradient (m), with u, h and t marked.]

Slide 40: Beta response
Responses with varying skewness and kurtosis.
Used in simulated coenoclines to test the robustness of ordination.
$\mu = k (x - p_1)^{\alpha} (p_2 - x)^{\gamma}$
Commonly fitted by fixing the endpoints $p_1$ and $p_2$ and using GLM: Not flexible any longer, but greatly influenced by the endpoints.
Must be fitted with non-linear regression.
[Figure: probability along the gradient (m) for three fits: endpoints fixed at 10 m, endpoints fixed at 50 m, and free endpoints.]

Slide 41: Parameters of Beta response
No clearly interpreted parameters.
$\alpha$ and $\gamma$ define:
1. The location of the mode.
2. The skewness of the response.
3. The kurtosis of the response.
$\mu = k (x - p_1)^{\alpha} (p_2 - x)^{\gamma}$ is zero at $p_1$ and $p_2$: absolute endpoints of the range.
k is a scaling parameter: height depends on the other parameters as well.
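The Beta response of Slides 40 and 41 can be sketched as below. Returning zero outside the endpoints is an assumption of this sketch, and the mode formula is not on the slides: it follows from setting the derivative of log µ to zero, assuming positive α and γ. All parameter values are hypothetical.

```python
def beta_response(x, k, p1, p2, alpha, gamma):
    """Beta response k * (x - p1)^alpha * (p2 - x)^gamma; zero outside (p1, p2)."""
    if x <= p1 or x >= p2:
        return 0.0
    return k * (x - p1) ** alpha * (p2 - x) ** gamma

def beta_mode(p1, p2, alpha, gamma):
    """Location of the maximum, from alpha / (x - p1) = gamma / (p2 - x)."""
    return (alpha * p2 + gamma * p1) / (alpha + gamma)

# Hypothetical skewed response on a gradient from 0 to 10.
k, p1, p2, alpha, gamma = 0.01, 0.0, 10.0, 1.0, 3.0
mode = beta_mode(p1, p2, alpha, gamma)
height_at_mode = beta_response(mode, k, p1, p2, alpha, gamma)
```

The height at the mode depends on k, the endpoints and both shape parameters together, which is why none of the parameters has a simple interpretation on its own.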
Slide 42: HOF models
Huisman-Olff-Fresco: A set of five hierarchic models with different shapes.
$\mu = \frac{M}{[1 + \exp(a + bx)][1 + \exp(c - dx)]}$
Model  Shape      Parameters
V      Skewed     a b c d
IV     Symmetric  a b c b
III    Plateau    a b c 0
II     Monotone   a b 0 0
I      Flat       a 0 0 0
[Figure: probability along the gradient (m) for models II, III, IV and V.]

Slide 43: HOF: Inference on response shape
Alternative models differ only in response shape.
Selection of a parsimonious model with statistical criteria.
Shape is a parametric concept, and parametric HOF models may be the best way of analysing differences in response shapes.
[Figure: frequency of the most parsimonious HOF models (I to V) on the gradient in Mt. Field, Tasmania.]

Slide 44: Generalized Additive Models (GAM)
Generalized from GLM: linear predictor replaced with a smooth predictor.
$g(\mu) = \mathrm{smooth}(x)$
Smoothing by regression splines or other smoothers.
Degree of smoothing controlled by degrees of freedom: analogous to the number of parameters in GLM.
Everything else as in GLM.
Enormous use in ecology, also outside gradient modelling.
[Figure: fitted probability of EPACSERP along the gradient (m).]

Slide 45: Degrees of Freedom
The width of a smoothing window = Degrees of Freedom.
[Figure: EPACSERP fitted with different degrees of freedom.]
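The five HOF shapes can be written as one function with a model selector, as in the sketch below. The flat model I is written as M / (1 + exp(a)), which is an assumption here since the slides list only its parameters. All parameter values are hypothetical.

```python
import math

def hof_response(x, model, M, a, b=0.0, c=0.0, d=0.0):
    """Expected response for HOF models "I" to "V" with maximum attainable value M."""
    if model == "I":                       # flat: no dependence on x
        return M / (1.0 + math.exp(a))
    rising = 1.0 + math.exp(a + b * x)
    if model == "II":                      # monotone
        return M / rising
    if model == "III":                     # monotone with a plateau
        return M / (rising * (1.0 + math.exp(c)))
    if model == "IV":                      # symmetric: second slope equals b
        return M / (rising * (1.0 + math.exp(c - b * x)))
    if model == "V":                       # skewed: separate slopes b and d
        return M / (rising * (1.0 + math.exp(c - d * x)))
    raise ValueError("model must be one of I, II, III, IV, V")

# Model IV is symmetric about x0 = (c - a) / (2 * b).
M, a, b, c = 1.0, -4.0, 1.0, 6.0
x0 = (c - a) / (2.0 * b)
left = hof_response(x0 - 1.5, "IV", M, a, b, c)
right = hof_response(x0 + 1.5, "IV", M, a, b, c)
```

Because the models are nested, model V with d equal to b reproduces model IV, which is what makes statistical comparison of the shapes straightforward.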
Slide 46: Linear scale and response scale
GAM is smooth in the link scale, but the user prefers the response scale.
[Figure: POA.GUNN along the gradient (m), shown as the fitted smooth s(, 7.15) on the link scale and as the fitted response.]

Slide 47: Multiple gradients
Each gradient is fitted separately.
Interpretation easy: Only the individual main effects are shown and analysed.
Possible to select good parametric shapes.
Thin-plate splines: Same smoothness in all directions and no attempt at making responses parallel to the axes.
[Figure: fitted smooth terms s(, 3.89) and s(, 2.04).]

Slide 48: Interactions
GAMs are designed to show the main effects beautifully in panel plots.
Equivalent kernel is parallel to the axes.
[Figure: two panels, "Truth" and "GAM", over two gradients.]

Slide 49: Diversity and spatial scale
Whittaker suggested several concepts of diversity:
$\alpha$: Diversity on a sample plot, or point diversity.
$\beta$: Diversity along ecological gradients.
$\gamma$: Diversity among parallel gradients or classes of environmental variables.
$\delta$: The total diversity of a landscape: sum of all previous.
Slide 50: Many faces of beta diversity
What are we talking about when we are talking about beta diversity?
1. General heterogeneity of a community.
2. Decay of similarity with gradient separation.
3. Widths of species responses along gradients.
4. Rate of change in community composition along gradients.

Slide 51: General heterogeneity
Whittaker's index: Ratio of the average species richness on a single plot, $\bar{S}$, to the total species richness in all plots, $S_{TOT}$.
Total richness increases with increasing sample size.
Average richness stabilizes with increasing sampling effort.
$\bar{S}/S_{TOT}$ decreases with sample size.
No reference to gradients: even with a single location, replicate sampling decreases the index.
Pattern diversity: Within-site diversity.

Slide 52: Similarity decay with gradient separation
Plot community (dis)similarity against gradient separation and fit a linear regression.
Intercept: (Dis)similarity at zero distance: noise, replicate (dis)similarity, general heterogeneity or pattern diversity.
Slope: Beta diversity.
Half-change: Gradient distance where the expected similarity is half of the replicate similarity (intercept).
[Figure: community dissimilarity against gradient separation (m), with the replicate dissimilarity, the half-change and the threshold marked.]

Slide 53: Hill indices of beta diversity
1. Average width of species responses.
2. Variance of optima of species occurring in one site.
Used with scaling of ordination axes.
The first index is discussed and described, but the second is applied.
The two are equal only for degenerate species packing gradients.
[Figure: Gaussian responses fitted to species occurring in one site (Mt. Field, good drainage, site K05).]
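The sample-size dependence described on Slide 51 can be seen in a toy calculation of the ratio of average plot richness to total richness. The plots and species names below are made up, and the function name is an assumption.

```python
def richness_ratio(plots):
    """Average species richness per plot divided by total richness over all plots."""
    if not plots:
        raise ValueError("need at least one plot")
    all_species = set().union(*plots)
    mean_richness = sum(len(plot) for plot in plots) / len(plots)
    return mean_richness / len(all_species)

# Hypothetical plots, each with two species, sharing one species with the next plot.
plots = [{"a", "b"}, {"b", "c"}, {"c", "d"}]

ratio_1 = richness_ratio(plots[:1])   # one plot: average equals total
ratio_2 = richness_ratio(plots[:2])
ratio_3 = richness_ratio(plots)
```

Average richness stays at two species per plot while total richness grows with every added plot, so the ratio falls from `ratio_1` to `ratio_3` without any reference to a gradient.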
Slide 54: Hill rescaling of gradients
Hill index based on species occurrences in sites: random variation.
Smoothed by segments.
Each segment is made equally long in terms of the Hill index... almost.
[Figure: Hill index 1 and Hill index 2 along the original gradient (m).]

Slide 55: Hill scaling in practice
Four cycles are commonly performed, but that is not enough to stabilize the Hill index (with half steps taken).
[Figure: Hill index 1 and Hill index 2 along the Hill-scaled gradient.]

Slide 56: Are there species in common at 4sd distance?
The claim confounds the Normal probability density with the Gaussian response: the density has 95 % of its area within $\mu \pm 2\sigma$, but the height of the response at $u \pm 2t$ is still $0.135h$.
The range of a species depends on h, but in many cases a more realistic limit is $u \pm 3t$, where $\mu = 0.01h$.
If widths t vary, some species occur at longer distances.
Look at your data before saying that there are no species in common at 4 sd.

Slide 57: Rate of change along gradients
Instantaneous rate of change $\delta$ at any gradient point x, estimated from fitted species response functions $\mu$:
$\delta(x) = \sum_{j=1}^{S} \left| \frac{\partial \mu_j(x)}{\partial x} \right|$
HOF V: $\mu = \frac{M}{(1 + \exp(a + bx))(1 + \exp(c - dx))}$
HOF IV: $\mu = \frac{M}{(1 + \exp(a + bx))(1 + \exp(c - bx))}$
HOF III: $\mu = \frac{M}{(1 + \exp(a + bx))(1 + \exp(c))}$
HOF II: $\mu = \frac{M}{1 + \exp(a + bx)}$
[Figure: for each of HOF models II to V, the response $\mu$ and the rate of change $\partial\mu/\partial x$ along the gradient.]
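The instantaneous rate of change of Slide 57 can be sketched for Gaussian responses, whose derivative has a simple closed form. Two things are assumptions of this sketch: it uses Gaussian rather than HOF responses for brevity, and it sums absolute slopes so that increasing and decreasing species do not cancel. The species parameters are hypothetical.

```python
import math

def gaussian_response(x, u, h, t):
    """Gaussian response with optimum u, height h and width t."""
    return h * math.exp(-((x - u) ** 2) / (2.0 * t ** 2))

def gaussian_slope(x, u, h, t):
    """Derivative of the Gaussian response with respect to x."""
    return -gaussian_response(x, u, h, t) * (x - u) / t ** 2

def rate_of_change(x, species):
    """Sum of absolute slopes over species, each given as (u, h, t)."""
    return sum(abs(gaussian_slope(x, u, h, t)) for u, h, t in species)

# Hypothetical community of three species along a gradient.
species = [(2.0, 1.0, 1.0), (3.0, 0.8, 0.5), (4.5, 1.2, 1.5)]
delta_at_3 = rate_of_change(3.0, species)

# Relative height of a response two widths away from its optimum.
relative_height_2t = gaussian_response(2.0, 0.0, 1.0, 1.0)
```

`relative_height_2t` is exp(-2), about 0.135, the figure quoted on Slide 56 for the height of a response at two widths from its optimum.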
Slide 58: Rescaling to constant rate of change
Make the interval between any two gradient points a and b equal to the total accumulated change $\Delta_{ab}$ between the points:
$\Delta_{ab} = \int_a^b \delta(x)\,dx$
Can be based on any response model: The example uses HOF.
[Figure: beta diversity against original and rescaled alkalinity.]

Slide 59: Alternative rescaling and response shapes
Direct rescaling and Hill rescaling are inconsistent.
The two Hill indices of beta diversity are inconsistent.
None of the rescaling methods produces symmetric response shapes.
Ordination axes tend to produce symmetric responses.
[Figure: frequency of HOF models I to V for MDS, rescaled CA, CA, Hill rescaling and direct rescaling.]

Slide 60: Weighted averages in bioindication
$\tilde{x}_i = \frac{\sum_{j=1}^{S} y_{ij} u_j}{\sum_{j=1}^{S} y_{ij}}$
Weighted average of indicator values of species occurring in a site.
Can use species weighted averages $\tilde{u}_j$ or other indicator values $u_j$.
Repeated cycling $x \to \tilde{u}$, $\tilde{u} \to x$, ..., $x \to \tilde{u}$ gives a solution of the first axis in correspondence analysis.

Slide 61: Deshrinking: stretch weighted averages
The range and variance of weighted averages are smaller than those of the values they are based on: deshrinking restores the original variance.
1. Inverse regression: regress gradient values on WAs.
2. Classical regression: regress WAs on gradient values.
3. Simple stretching: make variances equal.
[Figure: gradient values against weighted averages, with the 1:1 line.]
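Site weighted averages and the first two deshrinking options of Slide 61 can be sketched with ordinary least squares. The function names and all data below are hypothetical; the weighted averages are constructed to be an exact linear shrinkage of the gradient so that the effect is easy to verify.

```python
def weighted_average(weights, values):
    """Weighted average, e.g. of species indicator values weighted by abundance."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * v for w, v in zip(weights, values)) / total

def linear_fit(xs, ys):
    """Least-squares intercept and slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def deshrink_inverse(wa, x):
    """Inverse regression: regress gradient values on weighted averages."""
    a, b = linear_fit(wa, x)
    return [a + b * w for w in wa]

def deshrink_classical(wa, x):
    """Classical regression: regress weighted averages on gradient values, then invert."""
    a, b = linear_fit(x, wa)
    return [(w - a) / b for w in wa]

site_value = weighted_average([1, 3], [5.0, 6.0])   # one site, two species

x = [1.0, 2.0, 3.0, 4.0, 5.0]                       # known gradient values
wa = [2.0 + 0.5 * xi for xi in x]                   # shrunken towards the centre
restored_inverse = deshrink_inverse(wa, x)
restored_classical = deshrink_classical(wa, x)
```

With noise-free data both methods restore x exactly; with real data they differ, because inverse regression minimizes error in the gradient values while classical regression stretches the estimates further.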
Slide 62: Goodness of prediction: Bias and error
Goodness: prediction error.
Correlation bad: depends on the range of observations.
Root mean squared error: $\epsilon = \sqrt{\sum_{i=1}^{N} (\tilde{x}_i - x_i)^2 / N}$
Bias b: systematic difference.
Error $\varepsilon$: random error about bias.
$\epsilon^2 = b^2 + \varepsilon^2$
[Figure: prediction error with rmse, bias and error marked.]

Slide 63: Cross validation
Leave-one-out ("jackknife"), each observation in turn, or divide the data into training and test data sets.
Prediction error must be cross-validated, or it is badly biased.
[Figure: prediction error against real values for prediction within the training set and for cross validation.]

Slide 64: Bioindication: Likelihood approach
Likelihood is the probability of a given observed value with a certain expected value.
Maximum likelihood estimation: Expected values that give the best likelihood for the observations.
ML estimates are close to observed values, and the proximity is measured with the likelihood function.
Commonly we use the negative logarithm of the likelihood, since combined probabilities may be very small.

Slide 65: Regression and Bioindication
Regression:
We know the gradient values x and observed species abundances y.
We find the most likely expected values $\hat{\mu}$ for species.
Bioindication:
We know the observed species abundances y.
We have a gradient model that gives the expected abundances $\mu$ for any gradient value x.
We find the most likely gradient values $\hat{x}$ that maximize the likelihood of observing y when expecting $\mu$.
ML bioindication can be used with many response models and with many gradients.
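The split of the squared prediction error into squared bias and squared random error can be checked numerically with a short sketch. The observed and predicted values below are hypothetical, and the random error is computed with divisor N so that the identity holds exactly.

```python
import math

def prediction_errors(predicted, observed):
    """Return (rmse, bias, error) for paired predictions and observations."""
    n = len(observed)
    residuals = [p - o for p, o in zip(predicted, observed)]
    bias = sum(residuals) / n                                        # systematic difference
    error = math.sqrt(sum((r - bias) ** 2 for r in residuals) / n)   # random error about bias
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    return rmse, bias, error

observed = [5.2, 5.6, 6.0, 6.4]     # hypothetical measured gradient values
predicted = [5.5, 5.7, 6.0, 6.2]    # hypothetical predictions

rmse, bias, error = prediction_errors(predicted, observed)
```

For an honest estimate the predictions should come from cross validation, that is, each value predicted from a model fitted without that observation.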
Slide 66: Finding elevation from species composition
[Figure: negative log-likelihood (-loglik) against elevation (m).]
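Likelihood-based bioindication can be sketched as a search for the gradient value with the smallest negative log-likelihood. The sketch makes several assumptions that are not on the slides: Gaussian species responses, Poisson error, a plain grid search instead of a numerical optimizer, and made-up species parameters and counts.

```python
import math

def gaussian_response(x, u, h, t):
    """Expected abundance at gradient value x."""
    return h * math.exp(-((x - u) ** 2) / (2.0 * t ** 2))

def negative_log_likelihood(x, counts, species):
    """Poisson negative log-likelihood of the observed counts at gradient value x."""
    nll = 0.0
    for y, (u, h, t) in zip(counts, species):
        mu = max(gaussian_response(x, u, h, t), 1e-12)   # floor avoids log(0)
        nll += mu - y * math.log(mu) + math.lgamma(y + 1)
    return nll

def ml_gradient(counts, species, grid):
    """Grid value that minimizes the negative log-likelihood."""
    return min(grid, key=lambda x: negative_log_likelihood(x, counts, species))

# Hypothetical species (optimum in m, height, width in m) and counts from one site.
species = [(1000.0, 10.0, 150.0), (1200.0, 10.0, 150.0), (1400.0, 10.0, 150.0)]
counts = [4, 10, 4]
grid = range(900, 1501, 10)

estimated_elevation = ml_gradient(counts, species, grid)
```

Plotting `negative_log_likelihood` over the grid gives a curve of the kind shown on this slide, with the estimate at its minimum.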