Multivariate Ordination Analyses: Principal Component Analysis. Dilys Vela
1 Multivariate Ordination Analyses: Principal Component Analysis Dilys Vela Tatiana Boza
2 Multivariate Analyses A multivariate data set includes more than one variable recorded from a number of replicate sampling or experimental units, sometimes referred to as objects.
3 If these objects are organisms, the variables might be morphological or physiological measurements. If the objects are ecological sampling units, the variables might be physicochemical measurements or species abundances.
4 What are ordination analyses? Ordination is the arrangement of items along a scale (axis) or multiple axes. The purpose of ordination is to summarize complex relationships graphically, extracting one or a few dominant patterns from an infinite number of possible patterns. Placing variables along an axis is possible because the ordination is based on the correlations among the variables.
5 What do ordination analyses help us to see? They select the most important variables from the many variables imagined or hypothesized. They reveal unforeseen patterns and suggest unforeseen processes.
6 What type of question can we answer with ordination analysis? In ecology, to seek and describe patterns of process. In community ecology, to describe the strongest patterns in species composition. In systematics, to recognize and define species boundaries.
7 [Diagram: types of multivariate analysis.] Multivariate Analysis divides into Ordination Analysis and Classification (or Clustering Analysis). Ordination Analysis divides into Direct Gradient Analysis and Indirect Gradient Analysis. Methods shown: Linear Regression (few species); Canonical CA (CCA) and Redundancy Analysis (RDA) (many species); Principal Coordinate Analysis (PCoA) and Non-metric Multidimensional Scaling (NMDS) (distance values); Principal Components Analysis (PCA), Correspondence Analysis (CA) and Detrended CA (DCA) (raw data available).
8 Principal Components Analysis Principal component analysis (PCA) is a statistical technique that has been specifically developed to address data reduction. In general terms, the major aim of PCA is to reduce the complexity of the interrelationships among a potentially large number of observed variables to a relatively small number of linear combinations of them, which are referred to as principal components. Principal components analysis finds a set of orthogonal standardized linear combinations which together explain all of the variation in the original data.
9 What are the assumptions of PCA? PCA assumes linear relationships among variables: the cloud of points in p-dimensional space has linear dimensions that can be effectively summarized by the principal axes. If the structure in the data is NONLINEAR (the cloud of points twists and curves its way through p-dimensional space), the principal axes will not be an efficient and informative summary of the data.
10 Considerations before running a PCA: Normal distributions. Data outliers. Transformations. Standardization. Data matrix.
11 Normal Distributions When using PCA, data normality is not essential. However, the method is based on the correlation or covariance matrix, which is strongly affected by non-normally distributed data and the presence of outliers.
12 Data outliers Extreme values as well as outliers can have a severe influence on PCA, since it is based on the correlation or covariance matrix (Pison et al., 2003). Outliers should thus be removed prior to the statistical analysis, or statistical methods able to handle outliers should be employed, and the influence of extreme values needs to be reduced (e.g., via a suitable transformation).
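The influence of a single extreme value on the correlation matrix can be illustrated with a small sketch. The data here are hypothetical (40 points on a unit circle, chosen so that the two variables are uncorrelated by construction), and the outlier position (10, 10) is an arbitrary assumption.

```python
import numpy as np

# Hypothetical data: 40 points evenly spaced on a unit circle,
# so x and y are uncorrelated by construction.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
x, y = np.cos(t), np.sin(t)
r_clean = np.corrcoef(x, y)[0, 1]

# Add one extreme observation at (10, 10), an arbitrary illustrative outlier.
x_out = np.append(x, 10.0)
y_out = np.append(y, 10.0)
r_outlier = np.corrcoef(x_out, y_out)[0, 1]

print("correlation without outlier:", round(r_clean, 3))
print("correlation with one outlier:", round(r_outlier, 3))
```

Because PCA works from this matrix, a single point like this one can be enough to create an apparent principal axis that the rest of the data do not support.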
13 Transformations Transformations change the scale of measurement of the data, usually to help meet the normality assumption of parametric analyses and the homogeneity of variance assumption of most of these analyses. Transformations are particularly important for multivariate procedures based on eigenanalysis (e.g. principal components analysis) because covariances and correlations measure linear relationships between variables. Transformations that improve linearity will increase the efficiency with which the eigenanalysis extracts the eigenvectors.
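As a minimal sketch of how a transformation can improve linearity, the example below uses a hypothetical pair of variables with an exponential relationship and compares the correlation before and after a log transformation. The choice of a log transformation is an assumption for illustration; the appropriate transformation depends on the data.

```python
import numpy as np

# Hypothetical variables with a curved (exponential) relationship.
x = np.linspace(0.0, 5.0, 50)
y = np.exp(x)

# The correlation measures only the linear part of the relationship.
r_raw = np.corrcoef(x, y)[0, 1]
# After a log transformation the relationship is exactly linear.
r_log = np.corrcoef(x, np.log(y))[0, 1]

print("correlation on raw scale:", round(r_raw, 3))
print("correlation after log transformation:", round(r_log, 3))
```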
14 Standardization The first stage in rotating the data cloud is to standardize the data by subtracting the mean and dividing by the standard deviation. It may be argued that we should not divide by the standard deviation. By standardizing, we are giving all species the same variation, i.e. a standard deviation of 1.
15 Data Matrix We actually can have it both ways: A PCA without dividing by the standard deviation is an analysis of the covariance matrix. A PCA in which you do indeed divide by the standard deviation is an analysis of the correlation matrix. When using species/variables measured in different units, you must use a correlation matrix.
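The link between standardization and the choice of matrix can be checked numerically. This is a sketch on hypothetical random data (30 objects, 3 variables in very different units); the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 30 objects (rows) x 3 variables (columns) in different units.
X = rng.normal(size=(30, 3)) * np.array([1.0, 10.0, 100.0]) + np.array([5.0, 50.0, 500.0])

# Centring only: a PCA on this is an analysis of the covariance matrix S.
S = np.cov(X, rowvar=False)

# Centring and dividing by the standard deviation (ddof=1 to match np.cov).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# The covariance matrix of the standardized data is the correlation matrix R.
R = np.corrcoef(X, rowvar=False)
cov_of_Z = np.cov(Z, rowvar=False)

print("covariance of standardized data equals R:", np.allclose(cov_of_Z, R))
```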
16 Look at the descriptors. Homogeneous nature (all the same kind, same units, same order of magnitude)? Use the S matrix (covariance). Heterogeneous nature (different kinds, different units, different orders of magnitude)? Use the R matrix (correlation).
17 Advantages and disadvantages. Correlation matrix: The results of analyses for different sets of random variables are more directly comparable. There are considerable differences in the standard deviations, caused mainly by differences in scale. None of the correlations is particularly large in absolute value. PCs have moderate-sized coefficients for several of the variables. PCs give coefficients for standardized variables and are therefore less easy to interpret directly. Covariance matrix: PCs for the covariance matrix are each dominated by a single variable. The variances and total variance are more meaningful indices for measuring variability in data sets that are symmetric. The PCs are sensitive to the units of measurement used for each variable. If there are large differences between the variances of the variables, then those variables whose variances are largest will tend to dominate the first few PCs.
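The point that large-variance variables dominate the first PCs of a covariance matrix can be sketched as follows. The data are hypothetical (three independent variables, the third on a scale 1000 times larger), and `first_pc` is an illustrative helper, not a function from any package.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: three independent variables; the third has a much larger spread.
X = rng.normal(size=(200, 3)) * np.array([1.0, 1.0, 1000.0])

def first_pc(matrix):
    """Return the eigenvector with the largest eigenvalue of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, -1]

pc1_cov = first_pc(np.cov(X, rowvar=False))       # PCA on the covariance matrix
pc1_cor = first_pc(np.corrcoef(X, rowvar=False))  # PCA on the correlation matrix

print("PC1 loadings, covariance matrix:", np.round(np.abs(pc1_cov), 3))
print("PC1 loadings, correlation matrix:", np.round(np.abs(pc1_cor), 3))
```

With the covariance matrix, PC1 is almost entirely the large-scale variable; with the correlation matrix, that dominance disappears because every variable has unit variance.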
18 Eigenvalues & Eigenvectors The eigenvectors are the loadings of the principal components spanning the new PCA coordinate system. The amount of variability contained in each principal component is expressed by the eigenvalues which are simply the variances of the scores.
19 PCA searches for the direction in the multivariate space that contains the maximum variability. This is the direction of the first principal component (PC1). The second principal component (PC2) has to be orthogonal (perpendicular) to PC1 and will contain the maximum amount of the remaining data variability. Subsequent principal components are found by the same principle.
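The relationship between eigenvectors, eigenvalues and scores can be sketched with NumPy. This is a minimal illustration on hypothetical random data (100 objects, 4 correlated variables) using an eigendecomposition of the covariance matrix; it is not the procedure of any particular software package.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical data: 100 objects x 4 correlated variables.
latent = rng.normal(size=(100, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.3 * rng.normal(size=(100, 4))

Xc = X - X.mean(axis=0)                # centre each variable
S = np.cov(Xc, rowvar=False)           # covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)   # ascending order for symmetric matrices
order = np.argsort(eigvals)[::-1]      # re-sort so that PC1 comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                  # coordinates of the objects on the PCs
explained = eigvals / eigvals.sum()    # proportion of total variance per PC

print("proportion of variance per PC:", np.round(explained, 3))
```

The eigenvalues equal the variances of the score columns, the eigenvectors are mutually orthogonal, and the eigenvalues sum to the total variance of the original variables.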
20 Biplots A biplot is a visualization tool to present results of PCA. The PCA biplot is called the scaling process. The loadings (arrows) represent the variables. The lengths of the arrows in the plot are directly proportional to the variability included in the two components (PC1 and PC2) displayed, and the angle between any two arrows is a measure of the correlation between those variables.
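A sketch of why arrow geometry reflects correlation, assuming a PCA on the correlation matrix and arrows defined as eigenvectors scaled by the square root of their eigenvalues (one common biplot scaling, assumed here for illustration). The data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical data: 60 objects x 4 correlated variables.
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(60, 4))

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Arrow coordinates: eigenvectors scaled by sqrt(eigenvalue); one row per variable.
arrows = eigvecs * np.sqrt(eigvals)

# Using all components, dot products between arrows reproduce the correlations.
full = arrows @ arrows.T
# A two-dimensional biplot keeps only PC1 and PC2, so it shows an approximation.
approx = arrows[:, :2] @ arrows[:, :2].T

print("max error of 2-D approximation:", np.round(np.abs(approx - R).max(), 3))
print("squared arrow lengths in 2-D:", np.round(np.diag(approx), 3))
```

Under this scaling, the squared length of an arrow in the two-dimensional plot is the share of that variable's variance captured by PC1 and PC2, so short arrows flag variables that the plot represents poorly.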
21 Misconceptions PCA cannot cope with missing values (but neither can most other statistical methods). It does not require normality. It is not a hypothesis test. There are no clear distinctions between response variables and explanatory variables.
22 When should PCA be used? In community ecology, PCA is useful for summarizing variables whose relationships are approximately linear or at least monotonic, e.g. a PCA of many soil properties might be used to extract a few components that summarize the main dimensions of soil variation. PCA is generally NOT useful for ordinating community data. Why? Because relationships among species are highly nonlinear.
23 Community trends along environmental gradients appear as horseshoes in PCA ordinations. None of the PC axes effectively summarizes the trend in species composition along the gradient. [Figure: PCA ordination, Axis 1 vs. Axis 2; panel label: Beta Diversity 2R - Covariance.]
24 The Horseshoe Effect Curvature of the gradient and the degree of infolding of the extremes increase with beta diversity. PCA ordinations are not useful summaries of community data except when beta diversity is very low. Using correlation generally does better than covariance. This is because standardization by species improves the correlation between Euclidean distance and environmental distance.
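The curvature can be reproduced with a small simulation. This is a sketch under stated assumptions: a single hypothetical gradient, species with Gaussian (unimodal) responses, evenly spaced optima and an arbitrary tolerance of 20 gradient units. All of these values are illustrative, not taken from the slides.

```python
import numpy as np

# Hypothetical coenocline: 51 sites evenly spaced along one gradient (0 to 100).
gradient = np.linspace(0.0, 100.0, 51)
# Species with Gaussian responses; optima extend past both ends of the gradient.
optima = np.linspace(-40.0, 140.0, 37)
tolerance = 20.0  # assumed niche width, in gradient units
Y = np.exp(-((gradient[:, None] - optima[None, :]) ** 2) / (2.0 * tolerance ** 2))

# PCA on the covariance matrix via SVD of the column-centred site x species table.
Yc = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
scores = U * s
pc1, pc2 = scores[:, 0], scores[:, 1]

centred = gradient - gradient.mean()
r_pc1_linear = np.corrcoef(pc1, centred)[0, 1]          # PC1 vs. gradient position
r_pc2_quadratic = np.corrcoef(pc2, centred ** 2)[0, 1]  # PC2 vs. squared position

print("|r| between PC1 and the gradient:", round(abs(r_pc1_linear), 2))
print("|r| between PC2 and the squared gradient:", round(abs(r_pc2_quadratic), 2))
```

Although the data contain only one gradient, PC2 is not an independent ecological axis: it is largely a curved function of position along the same gradient, which is what produces the arch when PC1 is plotted against PC2.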
25 What if there is more than one underlying ecological gradient? When there are two or more underlying gradients with high beta diversity, a horseshoe is usually not detectable. Interpretation problems are more severe.
26 Data Set
27 Morphological and anatomical variation of Calophyllum L. (Calophyllaceae) in South America. D. Vela
28 [Figure: classification of Kielmeyeroideae (Stevens, 2006).] Calophylleae: Calophyllum, Neotatea, Marila, Mahurea, Clusiella, Kielmeyera, Caraipa, Haploclathra, Poeciloneuron, Mesua, Kayea, Mammea. Endodesmieae: Endodesmia, Lebrunia.
29 Wurdack & Davis (2009)
30 Distribution of Calophyllaceae species. Stevens,
31
32 Vein Resin canal
33
34
35 Calophyllum brasiliense, Calophyllum inophyllum
36 There is infraspecific variation in tepal number between individuals of the same species, and between flowers from the same inflorescence. Stevens (1974, 1980)
37 Calophyllum brasiliense Calophyllum lanigerum Calophyllum pisiferum
38 1. Main objective 1.A To distinguish species limits of Calophyllum in South America. 2. Specific objectives 2.A To analyze morphological and anatomical variation.
39 Data collection for morphological observations. Herbarium and personal collections. Collection sort: qualitative characteristics (Systematics Association Committee for Descriptive Biological Terminology, cited by Stearn 2006). Measurement: ruler and a digital caliper. Excel data matrix: specimen collections in rows and variables in columns.
40 Leaf characters: Petiole length mm (PTL); Leaf length cm (LL); Leaf length at widest part cm (LWWP); Leaf width cm (LW); Apex length mm (PL); Midrib width at abaxial side mm (MW); Vein angle degree (VA); Venation density (VD). Flower characters: Pedicel length mm (PDL); Perianth width mm (PW); Perianth length mm (PRL); Anther length mm (AL); Anther width mm (AW); Stamen length mm (STL); Filament length mm (FL); Style length mm (STYL); Gynoecium length mm (GL); Ovary length mm (OL); Stigma width mm (SL). Fruit characters: External fruit length mm (FrLEx); External fruit width mm (FrWEx); Internal fruit length mm (FrLIn); Internal fruit width mm (FrWIn); Stigma remained mm (StygR); Basal discoloration mm (BsDis); Stone mm (Stn); Corky mm (CRK).
41 REFERENCES Claude, Julien. Morphometrics with R. Springer. Gotelli, Nicholas J., and Aaron M. Ellison. A primer of ecological statistics. Sinauer Associates Publishers. Jolliffe, I. T. Principal component analysis. Springer. Legendre, Pierre, and Louis Legendre. Numerical ecology. Elsevier. Quinn, Gerald Peter, and Michael J. Keough. 2002. Experimental design and data analysis for biologists. Cambridge University Press.
ISQS 6348 Final exam solutions. Name: Open book and notes, but no electronic devices. Answer short answer questions on separate blank paper. Answer multiple choice on this exam sheet. Put your name on
More informationBasics of Multivariate Modelling and Data Analysis
Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 2. Overview of multivariate techniques 2.1 Different approaches to multivariate data analysis 2.2 Classification of multivariate techniques
More informationDimension Reduction Techniques. Presented by Jie (Jerry) Yu
Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage
More informationAdvising on Research Methods: A consultant's companion. Herman J. Ader Gideon J. Mellenbergh with contributions by David J. Hand
Advising on Research Methods: A consultant's companion Herman J. Ader Gideon J. Mellenbergh with contributions by David J. Hand Contents Preface 13 I Preliminaries 19 1 Giving advice on research methods
More informationPCA Advanced Examples & Applications
PCA Advanced Examples & Applications Objectives: Showcase advanced PCA analysis: - Addressing the assumptions - Improving the signal / decreasing the noise Principal Components (PCA) Paper II Example:
More informationVariations in pelagic bacterial communities in the North Atlantic Ocean coincide with water bodies
The following supplement accompanies the article Variations in pelagic bacterial communities in the North Atlantic Ocean coincide with water bodies Richard L. Hahnke 1, Christina Probian 1, Bernhard M.
More informationAnalyse canonique, partition de la variation et analyse CPMV
Analyse canonique, partition de la variation et analyse CPMV Legendre, P. 2005. Analyse canonique, partition de la variation et analyse CPMV. Sémin-R, atelier conjoint GREFi-CRBF d initiation au langage
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,
More information7. Variable extraction and dimensionality reduction
7. Variable extraction and dimensionality reduction The goal of the variable selection in the preceding chapter was to find least useful variables so that it would be possible to reduce the dimensionality
More informationLab 7. Direct & Indirect Gradient Analysis
Lab 7 Direct & Indirect Gradient Analysis Direct and indirect gradient analysis refers to a case where you have two datasets with variables that have cause-and-effect or mutual influences on each other.
More informationGEOMETRIC MORPHOMETRICS. Adrian Castellanos, Michelle Chrpa, & Pedro Afonso Leite
GEOMETRIC MORPHOMETRICS Adrian Castellanos, Michelle Chrpa, & Pedro Afonso Leite WHAT IS MORPHOMETRICS? Quantitative analysis of form, a concept that encompasses size and shape. Analyses performed on live
More informationOverview of clustering analysis. Yuehua Cui
Overview of clustering analysis Yuehua Cui Email: cuiy@msu.edu http://www.stt.msu.edu/~cui A data set with clear cluster structure How would you design an algorithm for finding the three clusters in this
More informationCommunity surveys through space and time: testing the space time interaction
Suivi spatio-temporel des écosystèmes : tester l'interaction espace-temps pour identifier les impacts sur les communautés Community surveys through space and time: testing the space time interaction Pierre
More informationEigenvalues, Eigenvectors, and an Intro to PCA
Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.
More informationPHENETIC STUDIES OF ATROPA SPECIES IN IRAN
PHENETIC STUDIES OF ATROPA SPECIES IN IRAN M. Sheidai, M. Khatamsaz, & M. Goldasteh Sheidai, M., Khatamsaz, M. & Goldasteh, M. 005: Phenetic studies of Atropa species in Iran. -Iran. Journ. Bot. 9(1):
More informationSTAT 730 Chapter 14: Multidimensional scaling
STAT 730 Chapter 14: Multidimensional scaling Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Data Analysis 1 / 16 Basic idea We have n objects and a matrix
More informationLeast Squares Optimization
Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques. Broadly, these techniques can be used in data analysis and visualization
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationDimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining
Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can
More informationEach copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.
Effects of Sample Distribution along Gradients on Eigenvector Ordination Author(s): C. L. Mohler Source: Vegetatio, Vol. 45, No. 3 (Jul. 31, 1981), pp. 141-145 Published by: Springer Stable URL: http://www.jstor.org/stable/20037040.
More informationQuantitative Understanding in Biology Principal Components Analysis
Quantitative Understanding in Biology Principal Components Analysis Introduction Throughout this course we have seen examples of complex mathematical phenomena being represented as linear combinations
More informationTHE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH. Robert R. SOKAL and F. James ROHLF. State University of New York at Stony Brook
BIOMETRY THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH THIRD E D I T I O N Robert R. SOKAL and F. James ROHLF State University of New York at Stony Brook W. H. FREEMAN AND COMPANY New
More information