Introductory compositional data (CoDa) analysis for soil scientists


1 Introductory compositional data (CoDa) analysis for soil scientists
Léon E. Parent, Department of Soils and Agrifood Engineering, Université Laval, Québec

2 Definition (Aitchison, 1986)
Compositional data are strictly positive data constrained to a closed space (0, 1) defined by a d-dimensional simplex $S^d$.
Open simplex: $S^d = \{(u_1, \ldots, u_d) : u_i > 0,\ u_1 + \cdots + u_d < 1\}$
Closed simplex: $S^D = \{(u_1, \ldots, u_{d+1}) : u_i > 0,\ u_1 + \cdots + u_{d+1} = 1\}$, where $D = d + 1$ and $u_{d+1}$ is the filling value to 1.
As parts of some entity, compositional data are intrinsically multivariate and interrelated, some parts «filling» the information holes left by others.

3 Sources of bias in CoDa
Sources of bias (distortion) when analyzing CoDa:
Redundancy: there are D − 1 degrees of freedom for a D-part composition (one component is redundant because it can be computed by difference).
Subcompositional incoherence (scale dependency): results of statistical analyses depend on the basis (e.g. dry, organic or wet mass), whereas adding more components should not influence the conclusion about any subcomposition.
Non-normal distribution: confidence intervals may extend beyond the compositional space, i.e. below 0 or above 100%.

4 Redundancy
The compositional space is constrained between 0 and some whole such as 1, 100%, 1000 g kg⁻¹, 10⁶ mg kg⁻¹, 1 m³ m⁻³, etc.
Soil texture is defined by %sand + %silt + %clay = 100%. Hence, %sand can be computed as the difference between 100% and the sum of %silt and %clay.
Soil texture can be illustrated by a ternary diagram or a 2-D scatter diagram.

5 Redundancy bubble for a D-part composition (Parent et al., 2012)
Dual ratios a/b: $D(D-1)/2$ possible ratios; compound ratios such as $a/(b + c)$ (Parent et al., 2012).

6 Subcompositional incoherence (scale dependency)
Results of a statistical analysis should not depend on the scale of measurement, but this is too often ignored.
Sand + silt + clay + OM + water = 100%
Sand + silt + clay + OM = 100%
Sand + silt + clay = 100%
Example: soil texture is 40% sand, 35% silt, and 25% clay; SOM = 4%; water content = 20%. The percentage of each textural component depends on the basis.
Table columns: Subcomposition (scale), Sand, Silt, Clay, SOM, Water, Sum. Table rows: Texture; Texture + SOM; Texture + SOM + water.
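The effect of the basis can be sketched in a few lines of Python. This is only an illustration: it assumes that the five figures of the example can be treated as mass parts on a common scale and re-closed to 100% on each basis; the function name `close` is hypothetical.

```python
def close(parts, total=100.0):
    """Rescale positive parts so that they sum to `total` (closure)."""
    s = sum(parts.values())
    return {name: total * value / s for name, value in parts.items()}

# Assumption: the slide's figures are used as mass parts on one common scale.
soil = {"sand": 40.0, "silt": 35.0, "clay": 25.0, "SOM": 4.0, "water": 20.0}

texture = close({k: soil[k] for k in ("sand", "silt", "clay")})
texture_som = close({k: soil[k] for k in ("sand", "silt", "clay", "SOM")})
full = close(soil)

for label, comp in (("texture", texture),
                    ("texture + SOM", texture_som),
                    ("texture + SOM + water", full)):
    # Percentages change with the basis, the sand/clay ratio does not.
    print(label,
          {k: round(v, 1) for k, v in comp.items()},
          "sand/clay =", round(comp["sand"] / comp["clay"], 3))
```

The percentage of sand differs on each basis, whereas the sand/clay ratio stays the same, which is the property that log ratios exploit.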

7 Spurious correlations
Correlation coefficients vary with measurement scale or subcomposition.
Closure to 100% depends on what composition makes up this 100% (scale dependency).
CoDa are relative to each other, not absolute values.
A mixture contains 12 g of B and 8 g of E, hence a 20 g subcomposition made of B + E. B makes up 60% of this subcomposition closed to 20 g, and E 40%.
B/E = 12/8 = 60/40 = 1.5 for this subcomposition (scale invariance).
If the total mass of the mixture is 80 g, the B/E ratio remains the same (1.5), although the proportion of B in the whole mixture becomes 15% and that of E 10% (scale dependency).
To back-transform ratios into operational units (here 20 g), solve 2 equations for 2 unknowns: B/E = 1.5 and B + E = 20 g.
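The back-transformation of the B/E example can be written out as follows. This is a minimal sketch; the function name `back_transform` is hypothetical.

```python
def back_transform(ratio, total):
    """Solve B / E = ratio and B + E = total for B and E."""
    e = total / (1.0 + ratio)   # from B = ratio * E and B + E = total
    b = total - e
    return b, e

b, e = back_transform(1.5, 20.0)
print(b, e)  # 12.0 8.0

# Same ratio, different proportions once the whole mixture (80 g) is the basis.
print(100.0 * b / 80.0, 100.0 * e / 80.0)  # 15.0 10.0
```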

8 Spurious correlations
One component as basis: a random selection of a 3-part composition (correlation expected to be zero) will generate spurious correlation if two parts are ratioed on the third part (Pearson, 1897):
$x_1 = \frac{w_1}{w_3}$; $x_2 = \frac{w_2}{w_3}$
Spurious CORR $\approx \frac{\alpha_3^2}{\sqrt{(\alpha_1^2 + \alpha_3^2)(\alpha_2^2 + \alpha_3^2)}}$ (Aitchison, 1986)
$\alpha_1, \alpha_2, \alpha_3$ = coefficients of variation of $w_1, w_2, w_3$
Composition as basis (e.g. concentrations): large source of spurious correlations (Aitchison, 1986):
$x_1 = \frac{w_1}{w_1 + w_2 + w_3}$; $x_2 = \frac{w_2}{w_1 + w_2 + w_3}$; $x_3 = \frac{w_3}{w_1 + w_2 + w_3}$, where the same part is in both numerator and denominator
Arctic Lake data (sand = $w_1$, silt = $w_2$, clay = $w_3$) (Aitchison, 1986, p. 359):
Correlation between $x_1$ and $x_2$ = 0.50
Spurious correlation = 0.45, hence 90% of the total correlation is meaningless (see Excel sheet)
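Pearson's approximation can be checked with a small simulation. The sketch below is not the Arctic Lake computation: it assumes three independent, normally distributed parts with a mean of 10 and a coefficient of variation of 0.10 each (hypothetical values), and compares the simulated correlation of the two ratios with the formula.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spurious_corr(cv1, cv2, cv3):
    """First-order approximation of the correlation between w1/w3 and w2/w3."""
    return cv3 ** 2 / math.sqrt((cv1 ** 2 + cv3 ** 2) * (cv2 ** 2 + cv3 ** 2))

rng = random.Random(1897)        # fixed seed for reproducibility
n, mean, cv = 20000, 10.0, 0.10  # hypothetical settings
w1 = [rng.gauss(mean, mean * cv) for _ in range(n)]
w2 = [rng.gauss(mean, mean * cv) for _ in range(n)]
w3 = [rng.gauss(mean, mean * cv) for _ in range(n)]

x1 = [a / c for a, c in zip(w1, w3)]
x2 = [b / c for b, c in zip(w2, w3)]

r_raw = pearson(w1, w2)      # independent parts: close to zero
r_ratio = pearson(x1, x2)    # ratios on a common divisor: clearly positive
expected = spurious_corr(cv, cv, cv)
print(round(r_raw, 3), round(r_ratio, 3), round(expected, 3))
```

With equal coefficients of variation the formula gives 0.5, so half of the apparent association between the two ratios comes from the shared divisor alone.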

9 Non-normality
Normally distributed data and their CI are within the real space (±∞).
Weltje, G.J. 2002. Quantitative analysis of detrital modes: statistically rigorous confidence regions in ternary diagrams and their use in sedimentary petrology. Earth-Science Reviews 57.

10 Log ratios (log contrasts)
Data transformation: compositional space (raw) → real space (log ratio)
$lr = \ln \frac{g(r_i)}{g(s_i)}$; if $g(r_i)/g(s_i) \to +\infty$, $lr \to +\infty$; if $g(s_i)/g(r_i) \to +\infty$, $lr \to -\infty$.
Hence, the lr and its CI can span the whole real space.
CoDa transformation methods: alr, clr, ilr

11 Additive log ratio transformation (Aitchison, 1986)
$alr_i = \ln \frac{u_i}{u_{d+1}} = \ln u_i - \ln u_{d+1}$
$u_i$ = some proportion of the whole; $u_{d+1}$ = filling value ($F_v$) $= 1 - \sum_{i=1}^{d} u_i$
Ingestad's (1987) stoichiometric ratios: P/N, K/N, ..., where N replaces $u_{d+1}$
D − 1 degrees of freedom
alr axes are at a 60° angle from each other, hence not directly compatible with Euclidean geometry
Silt = 35%; clay = 25%; hence sand = 100% − 35% − 25% = 40%
$alr_{silt} = \ln \frac{35}{40} = -0.134$; $alr_{clay} = \ln \frac{25}{40} = -0.470$
After statistical analysis, back-transform alr means into familiar units using the alr values and the bounded sum (here 100%).
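The alr example and its back-transformation can be sketched as below, taking sand as the reference (filling) part. The function names `alr` and `alr_inverse` are hypothetical, not taken from any package.

```python
import math

def alr(parts, ref):
    """Additive log ratio of every part against the reference part `ref`."""
    return {k: math.log(v / parts[ref]) for k, v in parts.items() if k != ref}

def alr_inverse(alrs, ref, total=100.0):
    """Back-transform alr values into parts that sum to `total`."""
    expo = {k: math.exp(v) for k, v in alrs.items()}
    denom = 1.0 + sum(expo.values())   # the reference part contributes exp(0) = 1
    parts = {k: total * e / denom for k, e in expo.items()}
    parts[ref] = total / denom
    return parts

texture = {"sand": 40.0, "silt": 35.0, "clay": 25.0}
coords = alr(texture, ref="sand")
back = alr_inverse(coords, ref="sand", total=100.0)

print({k: round(v, 3) for k, v in coords.items()})
print({k: round(v, 1) for k, v in back.items()})
```

The round trip recovers 40% sand, 35% silt and 25% clay, which is how alr means are returned to familiar units.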

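12 Centred log ratio transformation (Aitchison, 1986)
$clr_i = \ln \frac{u_i}{g(u_i)}$, where $g(u_i)$ is the geometric mean of all parts
$u_i$ = some proportion of the whole; $u_{d+1} = 1 - \sum_{i=1}^{d} u_i$ = filling value ($F_v$)
Biplot analysis
Provided proper geometry to the DRIS approach (Parent & Dafir, 1992)
Silt = 35%; clay = 25%; hence sand = 100% − 35% − 25% = 40%; $g = (40 \times 35 \times 25)^{1/3} \approx 32.71$
$clr_{sand} = \ln \frac{40}{32.71} = 0.201$; $clr_{silt} = \ln \frac{35}{32.71} = 0.068$; $clr_{clay} = \ln \frac{25}{32.71} = -0.269$
$\sum_{i=1}^{D} clr_i = 0$, hence singularity (one clr is excluded from statistical analysis, in general that of $u_{d+1}$)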
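A minimal sketch of the clr computation for the same texture follows. The function name `clr` is hypothetical; the zero-sum property is checked at the end.

```python
import math

def clr(parts):
    """Centred log ratio: log of each part over the geometric mean of all parts."""
    logs = {k: math.log(v) for k, v in parts.items()}
    mean_log = sum(logs.values()) / len(logs)   # ln of the geometric mean
    return {k: v - mean_log for k, v in logs.items()}

texture = {"sand": 40.0, "silt": 35.0, "clay": 25.0}
coords = clr(texture)

print({k: round(v, 3) for k, v in coords.items()})
print("sum of clr values:", round(sum(coords.values()), 12))
```

Because the clr values always sum to zero, their covariance matrix is singular, which is why one clr is dropped before statistical analysis.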

13 Nutrient diagnosis (theories in plant nutrition) based on the Law of the minimum: Liebig's barrel

14 Source of distortion in ceteris paribus (Law of the minimum) shown by clr
If the distance between two compositions is computed as $\ln u_i - \ln v_i$, all other nutrients are assumed to be equal and nutrient interactions to be negligible.
Compositional vector u: $clr_i = \ln \frac{u_i}{g(u_i)} = \ln u_i - \ln g(u_i)$
Compositional vector v: $clr_i = \ln \frac{v_i}{g(v_i)} = \ln v_i - \ln g(v_i)$
The ceteris paribus assumption holds if and only if $\ln g(u_i) = \ln g(v_i)$.
The clr is a geometrically correct formulation for DRIS (Parent and Dafir, 1992).
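The gap between the two ways of comparing compositions can be illustrated numerically. The nutrient values below are hypothetical (two N, P, K compositions in arbitrary units), chosen so that N is identical in both while P and K differ.

```python
import math

def clr(parts):
    """Centred log ratio of a composition given as a dict of positive parts."""
    logs = {k: math.log(v) for k, v in parts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {k: v - mean_log for k, v in logs.items()}

# Hypothetical compositions: same N, different P and K.
u = {"N": 30.0, "P": 3.0, "K": 20.0}
v = {"N": 30.0, "P": 2.0, "K": 25.0}

naive = {k: math.log(u[k]) - math.log(v[k]) for k in u}   # ceteris paribus view
clr_u, clr_v = clr(u), clr(v)
clr_diff = {k: clr_u[k] - clr_v[k] for k in u}            # accounts for all parts

print({k: round(x, 4) for k, x in naive.items()})
print({k: round(x, 4) for k, x in clr_diff.items()})
```

The naive difference for N is zero, whereas the clr difference for N is not, because the geometric means of u and v differ: the status of N relative to the other nutrients has changed even though its own value has not.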

15 Isometric log ratio transformation (Egozcue et al., 2003)
$ilr_i = \sqrt{\frac{rs}{r+s}} \ln \frac{g(r_i)}{g(s_i)}$, where $\sqrt{\frac{rs}{r+s}}$ is the normalization coefficient used to compare different balances
Log contrast between two non-overlapping subcompositions (orthogonality), also called «orthonormal balance»
Sequential binary partition (SBP): arrangement of balances
D − 1 degrees of freedom and Euclidean geometry
Silt = 35%; clay = 25%; hence sand = 100% − 35% − 25% = 40%
Balances: [clay | silt, sand], with coefficient $\sqrt{\frac{2 \times 1}{2 + 1}}$, and [silt | sand]
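The two balances of the texture example can be computed as sketched below. The sign convention is an assumption: the group written on the right of the bar is placed in the numerator. Reversing it only changes the sign of the balance. The function names are hypothetical.

```python
import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(parts, numerator, denominator):
    """Orthonormal balance between two non-overlapping groups of parts."""
    r, s = len(numerator), len(denominator)
    g_num = geometric_mean([parts[k] for k in numerator])
    g_den = geometric_mean([parts[k] for k in denominator])
    return math.sqrt(r * s / (r + s)) * math.log(g_num / g_den)

texture = {"sand": 40.0, "silt": 35.0, "clay": 25.0}

# Assumed orientation: [clay | silt, sand] puts silt and sand in the numerator.
b1 = balance(texture, numerator=["silt", "sand"], denominator=["clay"])
# Assumed orientation: [silt | sand] puts sand in the numerator.
b2 = balance(texture, numerator=["sand"], denominator=["silt"])

print(round(b1, 3), round(b2, 3))
```

Three parts give D − 1 = 2 balances, and the sum of their squares equals the sum of the squared clr values of the same composition, which is what isometry means here.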

16 Sequential binary partition and mobile (fulcrums-and-buckets) design to define balances
Table columns: Partition, Sand, Silt, Clay, r, s
Figure labels: hierarchy; balance domain; concentration domain

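17 Chemical equilibrium as log contrast
$\alpha_1 x_1 + \alpha_2 x_2 \rightleftharpoons \alpha_3 x_3 + \alpha_4 x_4$, with equilibrium constant $K_e$
$\ln K_e = \ln \frac{x_1^{\alpha_1} x_2^{\alpha_2}}{x_3^{\alpha_3} x_4^{\alpha_4}}$, or
$lr_{K_e} = \ln (x_1^{\alpha_1} x_2^{\alpha_2}) - \ln (x_3^{\alpha_3} x_4^{\alpha_4}) = (\alpha_1 \ln x_1 + \alpha_2 \ln x_2) - (\alpha_3 \ln x_3 + \alpha_4 \ln x_4)$
Reactants: $lr_{reactants} = \alpha_1 \ln x_1 - \alpha_2 \ln x_2$
Products: $lr_{products} = \alpha_3 \ln x_3 - \alpha_4 \ln x_4$
Ilr diagram: reactants ($x_1$, $x_2$) balanced against products ($x_3$, $x_4$)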

18 Scale-independent orthonormal balance: subcompositional coherence
Figure labels: balance domain; concentration (proportion) domain

19 Understanding behind balances
Multivariate distances are independent of the way balances are arranged in the SBP or the mobile.
No knowledge: unstructured balances
Meaningful balances:
Scientific theories or hypotheses (e.g. anions vs. cations) and hierarchical arrangement: literature review
Nutrient management: fertilizers to feed crops (N, P, K) or lime (Ca and Mg carbonates) to neutralize exchangeable acidity
Nutrient budgets: input and output components
Biplot (correlations): data driven, no cause-and-effect validation
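The independence of multivariate distances from the arrangement of balances can be verified on a small case. The sketch below uses two different sequential binary partitions of sand, silt and clay; the second soil and both partitions are hypothetical choices made for illustration.

```python
import math

def ilr_from_sbp(parts, sbp):
    """ilr coordinates from an SBP given as rows of +1 (numerator),
    -1 (denominator) and 0 (part not involved in that balance)."""
    logs = [math.log(p) for p in parts]
    coords = []
    for row in sbp:
        plus = [l for l, sign in zip(logs, row) if sign > 0]
        minus = [l for l, sign in zip(logs, row) if sign < 0]
        r, s = len(plus), len(minus)
        # log of a geometric mean = mean of the logs
        coords.append(math.sqrt(r * s / (r + s)) * (sum(plus) / r - sum(minus) / s))
    return coords

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Part order: sand, silt, clay.
sbp_a = [[+1, +1, -1], [+1, -1, 0]]   # (sand, silt) vs clay, then sand vs silt
sbp_b = [[-1, +1, +1], [0, +1, -1]]   # (silt, clay) vs sand, then silt vs clay

soil_1 = [40.0, 35.0, 25.0]
soil_2 = [60.0, 25.0, 15.0]           # hypothetical second soil

d_a = euclidean(ilr_from_sbp(soil_1, sbp_a), ilr_from_sbp(soil_2, sbp_a))
d_b = euclidean(ilr_from_sbp(soil_1, sbp_b), ilr_from_sbp(soil_2, sbp_b))
print(round(d_a, 6), round(d_b, 6))
```

The individual balances differ between the two partitions, but the distance between the two soils is the same, so the choice of SBP matters for interpretation and not for distance-based classification.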

20 Balanced Brazilian recipe for bolo de fubá (maize flour cake) (Marisa N.)

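21 Mobile with 6 balances and 7 ingredients centred to the recipe
$\sqrt{\frac{3}{4}} \ln \frac{\text{Baking powder}}{\sqrt[3]{\text{Sugar} \times \text{Maize} \times \text{Wheat}}}$
$\sqrt{\frac{2}{3}} \ln \frac{\text{Sugar}}{\sqrt{\text{Maize} \times \text{Wheat}}}$
$\sqrt{\frac{1}{2}} \ln \frac{\text{Maize}}{\text{Wheat}}$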

22 ILR as data transformation technique
Univariate analysis: ANOVA for each variable
Multivariate analysis (all ilrs or concentrations included):
Measure of bias from ordinary log-transformed or raw proportions or concentrations
Holistic classification of specimens (no ad hoc SBP needed)
Diagnosis (ad hoc SBP needed to interpret balances)
Others
Regression analysis:
Ilrs vs. ordinary log-transformed or raw concentrations as independent variables: performance influenced by correlations and overfitting
Variates (variables combined across compositions, such as principal components): ilrs remove bias

23 Multivariate distance
$D^2 = (ilr_i - ilr_i^*)^T \varphi^{-1} (ilr_i - ilr_i^*)$, where $ilr_i^*$ belongs to the reference composition
If φ = identity matrix: Euclidean (Aitchison) distance
If φ = variance matrix: chi-square variable
If φ = covariance matrix: Mahalanobis distance
Any composition can be compared holistically to a reference composition (e.g. the initial composition or the composition of a performing system) for classification or diagnostic purposes.
Figure: Euclidean distance shown as the hypotenuse in the (ilr1, ilr2) plane
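The three choices of φ can be compared on a two-balance example. All numbers below are hypothetical (an observed and a reference pair of ilr values, and a reference covariance matrix), and the helper handles the 2 × 2 case only.

```python
def quadratic_form(diff, phi):
    """diff' * inverse(phi) * diff for a 2-vector and a 2x2 matrix."""
    (a, b), (c, d) = phi
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    tmp = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
           inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return diff[0] * tmp[0] + diff[1] * tmp[1]

# Hypothetical ilr values of a specimen and of a reference composition.
ilr_obs = [0.30, 0.10]
ilr_ref = [0.10, 0.40]
diff = [o - r for o, r in zip(ilr_obs, ilr_ref)]

identity = [[1.0, 0.0], [0.0, 1.0]]
variance = [[0.04, 0.0], [0.0, 0.09]]      # hypothetical variances only
covariance = [[0.04, 0.01], [0.01, 0.09]]  # hypothetical, with covariance

d2_euclid = quadratic_form(diff, identity)       # squared Euclidean (Aitchison)
d2_chi = quadratic_form(diff, variance)          # sum of squared standardized ilrs
d2_mahal = quadratic_form(diff, covariance)      # squared Mahalanobis
print(round(d2_euclid, 4), round(d2_chi, 4), round(d2_mahal, 4))
```

The quadratic form is a squared distance; taking its square root gives the distance itself. Using the full covariance matrix changes the result whenever the balances of the reference population are correlated.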

24 Distortion (departure from the 45° line) in raw nutrient CoDa due to nutrient interactions

25 Sound balances between anionic and cationic macronutrients

26 Complex nutrient balances in plants avoiding ceteris paribus

27 Receiver operating characteristic (ROC) for specimen classification

28 Means and SD of balances at fulcrums and concentrations in buckets for diagnostic purposes

29 Subcompositional N, P, K diagnosis where P is varied to reach the reference cloud

30 Nutrient signature of mango varieties (left) and of soil properties (right) to test genetic vs. environment effects on foliar compositions

31 Classification of nutrient signatures (N, P, K, Ca, Mg) among wild (left) and domesticated (right) fruit species

32 Challenges
Revisit non-compositional models with balances:
Soil & Plant Sciences (equations, quality indices, classification, diagnosis, ...)
Geosciences, Agronomy, Ecology (counts, proportions, nutrient cycles and budgets, sustainability indices, ...)
Support a paradigm change from interpreting a posteriori computer-driven results to a priori sound understanding of the relational data structure
Contribute to special issues using CoDa tools
Teach compositional models and train a new generation of researchers

33 Key papers
Aitchison, J., Shen, S.M. 1980. Logistic-normal distributions: some properties and uses. Biometrika 67(2).
Aitchison, J. 1986. The statistical analysis of compositional data. Chapman and Hall, London.
Aitchison, J., Greenacre, M. 2002. Biplots of compositional data. J. Royal Stat. Soc. Series C (Appl. Stat.) 51(4).
Filzmoser, P., Hron, K., Reimann, C. 2009. Univariate statistical analysis of environmental (compositional) data: Problems and possibilities. Sci. Total Environ. 407(23).
Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., Barceló-Vidal, C. 2003. Isometric log-ratio transformations for compositional data analysis. Math. Geol. 35.
Egozcue, J.J., Pawlowsky-Glahn, V. 2005. Groups of parts and their balances in compositional data analysis. Math. Geol. 37.
Parent, S.-É., Parent, L.E., Rozane, D.E., Hernandes, A., Natale, W. 2012. Nutrient balance as paradigm of plant and soil chemometrics. Chapter 4 (32 pp.). In Issaka, R.N. (ed.), Soil Fertility, InTech Publ.
Pearson, K. 1897. Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proceedings of the Royal Society of London LX.

34 Some of our open-access compositional papers in agronomy
of-plant-and-soil-chemometricsnutrient-balance-as-paradigm-of-soil-and-
w_zealand_kiwifruit_%28_actinidia_deliciosa_%29_at_high_yield_level

35 CoDaPack
Import data
Log-ratio transformations
Balance dendrogram
Biplots


Volume Composition of a Desirable Surface Soil Soil Chemistry Volume Composition of a Desirable Surface Soil 50% pore space 25% air 45 to 48% mineral matter 50% solid material 25% water 2 to 5% organic matter Soil Organic Matter Soil organic matter:

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Modified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain

Modified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain 152/304 CoDaWork 2017 Abbadia San Salvatore (IT) Modified Kolmogorov-Smirnov Test of Goodness of Fit G.S. Monti 1, G. Mateu-Figueras 2, M. I. Ortego 3, V. Pawlowsky-Glahn 2 and J. J. Egozcue 3 1 Department

More information

Estimating representational dissimilarity measures

Estimating representational dissimilarity measures Estimating representational dissimilarity measures lexander Walther MRC Cognition and rain Sciences Unit University of Cambridge Institute of Cognitive Neuroscience University

More information

A latent Gaussian model for compositional data with zeroes

A latent Gaussian model for compositional data with zeroes A latent Gaussian model for compositional data with zeroes Adam Butler and Chris Glasbey Biomathematics & Statistics Scotland, Edinburgh, UK Summary. Compositional data record the relative proportions

More information

Semi-Quantitative Analysis of Analytical Data using Chemometric Methods. Part II.

Semi-Quantitative Analysis of Analytical Data using Chemometric Methods. Part II. Semi-Quantitative Analysis of Analytical Data using Chemometric Methods. Part II. Simon Bates, Ph.D. After working through the various identification and matching methods, we are finally at the point where

More information

This is start of the single grain view

This is start of the single grain view SOIL TEXTURE, PARTICLE SIZE DISTRIBUTION, SPECIFIC SURFACE AND CLAY MINERALS We will assess the physical realm of soil science in a piecewise fashion starting with the physical phases of soil, -- a single

More information

CHAPTER 2. Types of Effect size indices: An Overview of the Literature

CHAPTER 2. Types of Effect size indices: An Overview of the Literature CHAPTER Types of Effect size indices: An Overview of the Literature There are different types of effect size indices as a result of their different interpretations. Huberty (00) names three different types:

More information

Compositional Kriging: A Spatial Interpolation Method for Compositional Data 1

Compositional Kriging: A Spatial Interpolation Method for Compositional Data 1 Mathematical Geology, Vol. 33, No. 8, November 2001 ( C 2001) Compositional Kriging: A Spatial Interpolation Method for Compositional Data 1 Dennis J. J. Walvoort 2,3 and Jaap J. de Gruijter 2 Compositional

More information

Short Note: Naive Bayes Classifiers and Permanence of Ratios

Short Note: Naive Bayes Classifiers and Permanence of Ratios Short Note: Naive Bayes Classifiers and Permanence of Ratios Julián M. Ortiz (jmo1@ualberta.ca) Department of Civil & Environmental Engineering University of Alberta Abstract The assumption of permanence

More information

A new distribution on the simplex containing the Dirichlet family

A new distribution on the simplex containing the Dirichlet family A new distribution on the simplex containing the Dirichlet family A. Ongaro, S. Migliorati, and G.S. Monti Department of Statistics, University of Milano-Bicocca, Milano, Italy; E-mail for correspondence:

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Identification of geochemically distinct regions at river basin scale using topography, geology and land use in cluster analysis

Identification of geochemically distinct regions at river basin scale using topography, geology and land use in cluster analysis Identification of geochemically distinct regions at river basin scale using topography, geology and land use in cluster analysis Ramirez-Munoz P. and Korre, A. Mining and Environmental Engineering Research

More information

Soil physical and chemical properties the analogy lecture. Beth Guertal Auburn University, AL

Soil physical and chemical properties the analogy lecture. Beth Guertal Auburn University, AL Soil physical and chemical properties the analogy lecture. Beth Guertal Auburn University, AL Soil Physical Properties Porosity Pore size and pore size distribution Water holding capacity Bulk density

More information

Analysis of Clays and Soils by XRD

Analysis of Clays and Soils by XRD Analysis of Clays and Soils by XRD I. Introduction Proper sample preparation is one of the most important requirements in the analysis of powder samples by X-ray diffraction (XRD). This statement is especially

More information

EXAM PRACTICE. 12 questions * 4 categories: Statistics Background Multivariate Statistics Interpret True / False

EXAM PRACTICE. 12 questions * 4 categories: Statistics Background Multivariate Statistics Interpret True / False EXAM PRACTICE 12 questions * 4 categories: Statistics Background Multivariate Statistics Interpret True / False Stats 1: What is a Hypothesis? A testable assertion about how the world works Hypothesis

More information

Experimental Design and Data Analysis for Biologists

Experimental Design and Data Analysis for Biologists Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Estimating and modeling variograms of compositional data with occasional missing variables in R

Estimating and modeling variograms of compositional data with occasional missing variables in R Estimating and modeling variograms of compositional data with occasional missing variables in R R. Tolosana-Delgado 1, K.G. van den Boogaart 2, V. Pawlowsky-Glahn 3 1 Maritime Engineering Laboratory (LIM),

More information

ACT Science Homework Science 2, Set 1 35 Minutes 38 Questions

ACT Science Homework Science 2, Set 1 35 Minutes 38 Questions ACT Science Homework Science 2, Set 1 35 Minutes 38 Questions Passage I DIRECTIONS: There are seven passages in this test. Each passage is followed by several questions. After reading a passage, choose

More information

Compositional Canonical Correlation Analysis

Compositional Canonical Correlation Analysis Compositional Canonical Correlation Analysis Jan Graffelman 1,2 Vera Pawlowsky-Glahn 3 Juan José Egozcue 4 Antonella Buccianti 5 1 Department of Statistics and Operations Research Universitat Politècnica

More information

Compositional data analysis of element concentrations of simultaneous size-segregated PM measurements

Compositional data analysis of element concentrations of simultaneous size-segregated PM measurements Compositional data analysis of element concentrations of simultaneous size-segregated PM measurements A. Speranza, R. Caggiano, S. Margiotta and V. Summa Consiglio Nazionale delle Ricerche Istituto di

More information

EFFECTIVENESS OF SCREENED AG-LIME A REVIEW

EFFECTIVENESS OF SCREENED AG-LIME A REVIEW EFFECTIVENESS OF SCREENED AG-LIME A REVIEW Peter Bishop and Mike Hedley Fertilizer and Lime Research Centre, Massey University The fineness of agricultural limestone and its agronomic effectiveness are

More information

Acid Soil. Soil Acidity and ph

Acid Soil. Soil Acidity and ph Acid Soil Soil Acidity and ph ph ph = - log (H + ) H 2 O H + + OH - (H + ) x (OH - )= K w = 10-14 measures H + activity with an electrode (in the lab), solutions (in the field) reflects the acid intensity,

More information

Remedial Measures, Brown-Forsythe test, F test

Remedial Measures, Brown-Forsythe test, F test Remedial Measures, Brown-Forsythe test, F test Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 7, Slide 1 Remedial Measures How do we know that the regression function

More information

Soil Chemistry. Dr. Shalamar Armstrong Dr. Rob Rhykerd Department of Agriculture

Soil Chemistry. Dr. Shalamar Armstrong Dr. Rob Rhykerd Department of Agriculture Soil Chemistry Dr. Shalamar Armstrong sdarmst@ilstu.edu Dr. Rob Rhykerd rrhyker@ilstu.edu Importance of soil Feeding the world World Population & Growth Other Asia Africa India China Latin America Europe

More information

May Interpreting the impact of explanatory variables in compositional models. Joanna Morais, Christine Thomas Agnan, and Michel Simioni

May Interpreting the impact of explanatory variables in compositional models. Joanna Morais, Christine Thomas Agnan, and Michel Simioni 17 805 May 017 Interpreting the impact of explanatory variables in compositional models Joanna Morais, Christine Thomas Agnan, and Michel Simioni Interpreting the impact of explanatory variables in compositional

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world

More information

Exploring the uncertainty of soil water holding capacity information

Exploring the uncertainty of soil water holding capacity information Exploring the uncertainty of soil water holding capacity information Linda Lilburne*, Stephen McNeill, Tom Cuthill, Pierre Roudier Landcare Research, Lincoln, New Zealand *Corresponding author: lilburnel@landcareresearch.co.nz

More information

Unconstrained Ordination

Unconstrained Ordination Unconstrained Ordination Sites Species A Species B Species C Species D Species E 1 0 (1) 5 (1) 1 (1) 10 (4) 10 (4) 2 2 (3) 8 (3) 4 (3) 12 (6) 20 (6) 3 8 (6) 20 (6) 10 (6) 1 (2) 3 (2) 4 4 (5) 11 (5) 8 (5)

More information

Introduction to Statistics with GraphPad Prism 7

Introduction to Statistics with GraphPad Prism 7 Introduction to Statistics with GraphPad Prism 7 Outline of the course Power analysis with G*Power Basic structure of a GraphPad Prism project Analysis of qualitative data Chi-square test Analysis of quantitative

More information

APPLICATION OF NEAR INFRARED REFLECTANCE SPECTROSCOPY (NIRS) FOR MACRONUTRIENTS ANALYSIS IN ALFALFA. (Medicago sativa L.) A. Morón and D. Cozzolino.

APPLICATION OF NEAR INFRARED REFLECTANCE SPECTROSCOPY (NIRS) FOR MACRONUTRIENTS ANALYSIS IN ALFALFA. (Medicago sativa L.) A. Morón and D. Cozzolino. ID # 04-18 APPLICATION OF NEAR INFRARED REFLECTANCE SPECTROSCOPY (NIRS) FOR MACRONUTRIENTS ANALYSIS IN ALFALFA (Medicago sativa L.) A. Morón and D. Cozzolino. Instituto Nacional de Investigación Agropecuaria.

More information

Nonparametric hypothesis testing for equality of means on the simplex. Greece,

Nonparametric hypothesis testing for equality of means on the simplex. Greece, Nonparametric hypothesis testing for equality of means on the simplex Michail Tsagris 1, Simon Preston 2 and Andrew T.A. Wood 2 1 Department of Computer Science, University of Crete, Herakleion, Greece,

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Three-group ROC predictive analysis for ordinal outcomes

Three-group ROC predictive analysis for ordinal outcomes Three-group ROC predictive analysis for ordinal outcomes Tahani Coolen-Maturi Durham University Business School Durham University, UK tahani.maturi@durham.ac.uk June 26, 2016 Abstract Measuring the accuracy

More information

Geochemical Data Evaluation and Interpretation

Geochemical Data Evaluation and Interpretation Geochemical Data Evaluation and Interpretation Eric Grunsky Geological Survey of Canada Workshop 2: Exploration Geochemistry Basic Principles & Concepts Exploration 07 8-Sep-2007 Outline What is geochemical

More information

A. V T = 1 B. Ms = 1 C. Vs = 1 D. Vv = 1

A. V T = 1 B. Ms = 1 C. Vs = 1 D. Vv = 1 Geology and Soil Mechanics 55401 /1A (2002-2003) Mark the best answer on the multiple choice answer sheet. 1. Soil mechanics is the application of hydraulics, geology and mechanics to problems relating

More information

Geology and Soil Mechanics /1A ( ) Mark the best answer on the multiple choice answer sheet.

Geology and Soil Mechanics /1A ( ) Mark the best answer on the multiple choice answer sheet. Geology and Soil Mechanics 55401 /1A (2003-2004) Mark the best answer on the multiple choice answer sheet. 1. Soil mechanics is the application of hydraulics, geology and mechanics to problems relating

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs) 36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)

More information