Comparisons between Q-R-mode methods in the study of distribution of recent facies in the sea floors of the Cadiz bay

Gonzalez-Caballero, J.L.* & Gutierrez-Mas, J.M.**
* Dpto. de Matemáticas, Universidad de Cádiz

Abstract

In this work we compare the outcomes obtained from three multivariate statistical methods: the Q-mode factor analysis, the technique of biplot representation and the correspondence analysis, applied to the study of the recent sediments of the sea floor of the Cadiz bay. The results obtained with these methods have allowed discrimination between types of sediments through the establishment of different mineralogical associations defined as facies, as well as the elaboration of models of sedimentary processes, aspects of great interest for knowledge of the behaviour of the coastal marine environment.

1 Introduction

The Singular Value Decomposition (SVD) of a matrix X (n,p), with n objects and p variables, allows us to write X using the principal directions obtained in the R^p and R^n spaces, where the row and column vectors of the X matrix can be represented, respectively. When sediments of the sea floor are extracted and properties of mineralogical composition or of grain size distribution are studied, the collected data are of compositional type, that is to say, the values of the variables for each sample add up to a constant quantity (generally 100). Reyment and Jöreskog [7] suggest, among other methods, the use of the
Applied Sciences and the Environment

Q-mode factor analysis, or 'inverted' factor analysis, proposed by Imbrie and Purdy [6]. The fact that the Q-mode factor analysis for an X matrix is a type of factor analysis of a similarity Q-matrix among the rows of X, of the type XX', allows us to express it in terms of the SVD of X, obtaining the factor loadings from a scaling of the eigenvectors of XX' and the factor scores from the eigenvectors of X'X. But the SVD is the foundation of many reduction and data representation techniques, among them the technique of biplot representation and the correspondence analysis, both also suggested by Reyment and Jöreskog [7] for the treatment of this type of geological data, under the denomination of Q-R-mode methods. In this communication we compare the outcomes obtained from the Q-mode factor analysis with those of the other Q-R-mode methods, in the study of the distribution of recent sediments in different sectors of the sea floors of the Cadiz bay, in order to check their effectiveness in the description and classification of the sediments.

2 The Q-mode factor analysis, the biplot and the correspondence analysis

The SVD of an X (n,p) matrix of rank r (≤ p ≤ n) allows us to decompose X as

X = V Λ U'    (1)

where Λ = diag(λ1, ..., λr) is a diagonal matrix with λ1², ..., λr² the positive eigenvalues of X'X, U = [u1, ..., ur] the matrix of orthonormal eigenvectors of X'X and V = [v1, ..., vr] the matrix of orthonormal eigenvectors of XX' corresponding to the eigenvalues λ1², ..., λr². Their statistical importance is due to Eckart and Young [1] and Householder and Young [5], who showed that if λ1 ≥ λ2 ≥ ... ≥ λr, the best fit, in the least-squares sense, of the X matrix by one of rank q (< r) is given by the (n,p) matrix

X(q) = V(q) Λ(q) U(q)'    (2)

taking the q first eigenvalues and eigenvectors; an absolute measure of the goodness of this fit can be defined by its proximity to 1.
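As an illustration (not part of the original paper), the decomposition (1) and the rank-q least-squares fit (2) can be sketched in a few lines of Python; the matrix is random and q = 2 is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((35, 7))  # e.g. n = 35 objects, p = 7 variables

# SVD: X = V @ diag(lam) @ U', singular values returned in decreasing order
V, lam, Ut = np.linalg.svd(X, full_matrices=False)

q = 2
Xq = V[:, :q] @ np.diag(lam[:q]) @ Ut[:q, :]  # best rank-q least-squares fit of X

# goodness of fit: share of the total sum of squares retained by the fit,
# close to 1 when the first q singular values dominate
fit = (lam[:q] ** 2).sum() / (lam ** 2).sum()
```

The residual sum of squares of the fit equals the sum of the discarded squared singular values, which is what makes X(q) optimal in the least-squares sense.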
2.1 The Q-mode factor analysis of Imbrie and Purdy

The Q-factor model supposes that each sediment can be expressed, approximately, as a linear combination of q (< min(n,p)) factors (pattern sediments), so that

X = A F' + E    (3)

where A, F and E represent, respectively, the factor loadings matrix, the factor scores matrix and the residual matrix of the part not explained by the model. The decomposition (1) allows us to obtain the model (3) without more than taking A = V(q) Λ(q) and F = U(q). Although such a decomposition also allows other alternatives, the previous one is usually taken because the factor loadings are directly comparable, the factors satisfying F'F = I_q. Imbrie and Purdy [6] proposed to apply the model (3) to analyse geological problems involving compositional data, defining the 'index of proportional similarity' between two sediments as the cosine of the angle between their row vectors; that is, applying the model (3) to the matrix W = D^(-1/2) X, where D is the (n,n) diagonal matrix of the row sums of squares of X. On the other hand, the same analytic criteria that are used in the general factor model to carry out rotations, orthogonal or oblique, inspired by the Thurstone criteria of simple structure, can also be used in the Q-factor model to obtain a simpler factor structure.

2.2 The biplot representation model

The biplot is a representation model, also based on the decomposition (1), that was introduced by Gabriel [2]. Given a data matrix X (n,p), the biplot provides a combined plot, exact or approximate according to the rank of X, of the n objects and the p variables in two dimensions. For that, from the decomposition (1), we can define the matrices G = V Λ^α and H = U Λ^(1-α), 0 ≤ α ≤ 1, with λ1^α, ..., λr^α the elements of Λ^α and similarly for Λ^(1-α), which allow us to express X as

X = G H'    (6)
the g_i and h_j vectors, with r components, being formed by the rows of G and H respectively. The least-squares properties of the decomposition (1) allow us to obtain an approximate representation of X in a plane, taking the first two components of g_i and h_j, denominated the biplot of X. For α = 1 one has G = V(2) Λ(2) and H' = U(2)', denominated the principal component biplot. It verifies that H'H = I_2 (notice the equivalence of G and H with A and F for q = 2 in the Q-factor model), and also XX' = GG', that is to say, the relationships among the rows of X with respect to the euclidean metric can be represented by those of the g_i vectors with the same metric.

2.3 The correspondence analysis

The technique denominated correspondence analysis (Greenacre [3]) is a procedure that allows one to obtain a particular graphic representation of the rows and columns of a non-negative data matrix. Consider an X (n,p) matrix of N observations arranged in a two-way contingency table, where the n rows and the p columns represent, respectively, the n and p categories of two discrete variables, with x_ij denoting the number of observations which take the ith category (i = 1,...,n) for the first variable and the jth category (j = 1,...,p) for the second variable. The procedure is based on obtaining first a matrix R = P - fc', where P = (1/N)X, f = P 1_p, c = P' 1_n, and 1_p, 1_n are vectors of p and n elements respectively with all elements unity.
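The construction of the R matrix just described can be sketched as follows (illustrative Python; the contingency table is invented, not the paper's data):

```python
import numpy as np

# hypothetical 3x3 two-way contingency table (counts of N observations)
X = np.array([[20., 10., 5.],
              [10., 30., 15.],
              [5., 10., 25.]])
N = X.sum()

P = X / N                      # correspondence matrix P = (1/N) X
f = P @ np.ones(P.shape[1])    # row masses, f = P 1_p
c = P.T @ np.ones(P.shape[0])  # column masses, c = P' 1_n

R = P - np.outer(f, c)         # residuals from the independence model
```

By construction every row and every column of R sums to zero: R measures only departures from independence, which is what the subsequent rescaling and SVD decompose.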
In fact, the R matrix is the residual matrix obtained when the independence model is fitted to P. If we define the (n,n) diagonal matrix D_f, whose elements are those of the f vector, and the (p,p) diagonal matrix D_c, with elements those of the c vector, the transformation Z = D_f^(-1/2) R D_c^(-1/2) allows us to rescale in different ways the rows and the columns of R, according to the inverse of the square root of the row and column totals, respectively, which homogenizes the scale of rows and columns of R. Again, the SVD given in (1) of the Z matrix, taking the first two singular values and associated eigenvectors, allows us to obtain coordinates in the euclidean plane for the rows and columns of R, which are rescaled versions of the principal components of Z, given by:

F = D_f^(-1/2) V(2) Λ(2),   C = D_c^(-1/2) U(2) Λ(2)    (7)

The expressions in (7), as well as the relationship that exists
among F and C, allow us to interpret in the plot that proximity among the points that represent the rows of X reveals the same behaviour in relation to the columns, and reciprocally for the points that represent the columns. Also, the proximity of column points to a row point reveals the influence of the former on the latter. Lastly, the proximity of a point (row or column) to a certain axis expresses its contribution to the definition of that axis, being greater the further the point lies from the centre of the representation.

3 Analysis and discussion of recent sediments of the Cadiz bay

With the purpose of comparing the results obtained with each one of the three procedures of section 2, a sampling of sediments obtained at 35 stations on the sea floor of the Cadiz bay has been analysed. Two types of properties have been measured in these samples: on the one hand the composition, through compositional variables such as the content in Quartz (QU), Feldspars (FE), Phyllosilicates (PH) and Carbonates (CA), whose sum of values for each sediment is 100%; on the other hand the granulometric nature, that is to say, the grain size distribution, measured by the size fraction Gravel (GR), Sand (SA) and Mud (MU) content, whose sum is also 100%. The relationships we find among the variables of both characteristics will be those which best define sediment types (facies) on the sea floors of the Cadiz bay. From a descriptive analysis of the variables measured, we can deduce that Quartz is the most abundant component, together with Sand, constituting the most permanent and characteristic properties of the sampling carried out. The results of the Q-mode factor model described in section 2 for the W matrix give the factor loadings and factor scores shown, respectively, in Tables 1 and 2, for the initial factors and for the factors rotated by the varimax procedure.
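A hedged sketch of the computation behind such tables, on synthetic compositional data standing in for the real samples (varimax rotation omitted; only the initial solution is shown):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((35, 7))
X = 100.0 * X / X.sum(axis=1, keepdims=True)   # compositional rows summing to 100

# W = D^(-1/2) X with D the diagonal matrix of row sums of squares,
# so each row of W has unit length (cos-theta similarity scaling)
W = X / np.sqrt((X ** 2).sum(axis=1, keepdims=True))

V, lam, Ut = np.linalg.svd(W, full_matrices=False)
q = 2
A = V[:, :q] * lam[:q]   # factor loadings, A = V(q) Lambda(q)
F = Ut[:q].T             # factor scores, with F'F = I_q

explained = (lam[:q] ** 2) / (lam ** 2).sum()  # variability explained per factor
```

Because the rows of W all lie in the positive orthant with unit length, the first factor typically absorbs most of the variability, mirroring the dominant Factor 1 reported for the Cadiz bay samples.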
On analysing the initial factor scores, we can make out two sedimentary facies, one as the majority (Factor 1), which explains 96% of the variability and is predominant in the area considering the high loadings that this factor has in all the sediments. This facies can be described as bioclastic quartziferous sand and represents the dominance of traction transport, as bottom load and associated with the
highest energy processes that take place on the sea floor. The other facies (Factor 2) can be described as quartziferous bioclastic mud, only predominant in some sectors, and represents the dominance of suspension-load transport, which requires less energy.

Table 1: Factor loadings for the initial solution (left) and the rotated orthogonal solution (right) from Q-mode factor analysis.

In Figure 1 it can be appreciated how almost all the sediments lie in a range of high factor loadings (between 0.8 and 1) for Factor 1, while for Factor 2 a great part of them take negative factor loadings (between -0.2 and 0), others take low positive values (between 0 and 0.3) and only a few take relatively high positive
values (between 0.3 and 0.6). The analysis results indicate the coexistence of two processes that have given rise to the formation of these sediments. The dominant process is tractive in character and, associated with the high energy of the sea floor environment, is the one that has given rise to the bioclastic quartziferous sands. Later, the sediments deposited as a consequence of this process were reworked in a lower-energy environment, calmer, deeper and farther from the coast, giving rise to the precipitation of fine sizes (Mud) from suspension.

Table 2: Factor scores for the initial solution (left) and rotated orthogonal solution (right) from Q-mode factor analysis.

Figure 1: Initial factor loadings.

This last process altered the properties of the pre-existing sediments, especially the grain size distribution, and its influence varies from some areas to others, depending on the geographical situation of the sampling stations in relation to the transport agent. Thus, the negative factor loadings for Factor 2 correspond to the areas little or not at all affected by this process, while the growing positive values indicate the areas of greater influence.
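The two-factor structure described above can also be read from a principal component biplot of the kind defined in section 2.2: with α = 1, the first two columns of G = V Λ give the sediment (row) markers and those of H = U give the variable (column) markers. A minimal sketch, on invented row-normalised data rather than the paper's W matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.random((35, 7))
W = W / np.sqrt((W ** 2).sum(axis=1, keepdims=True))  # row-normalised data matrix

V, lam, Ut = np.linalg.svd(W, full_matrices=False)

G = V[:, :2] * lam[:2]  # row (sediment) markers, G = V(2) Lambda(2)
H = Ut[:2].T            # column (variable) markers, H = U(2)

# the inner products G @ H.T reproduce the best rank-2 fit of W,
# so rows and columns can be interpreted jointly in one display
W2 = G @ H.T
```

Plotting the rows of G and H on the same pair of axes yields a display of the Figure 2 type, where row markers coincide with the factor loadings and column markers with the factor scores of the two-factor Q-mode solution.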
As for the analysis of the factor scores and loadings obtained after rotating orthogonally with the varimax procedure, the results show how the facies are clarified even further when the explained variability is distributed as 61% for Factor 1 against 39% for Factor 2.

Figure 2: Principal components biplot (Axis 1: 96.09%, Axis 2: 3.91%).

Figure 3: Plot of rows and columns with correspondence analysis.

The results analysed with the Q-factor model can be seen with greater clarity in Figure 2, obtained by representing the principal components biplot for the same W matrix. In it, besides appreciating how both the factor loadings and the factor scores of the two-factor solutions can be obtained without more than rotating the axes orthogonally, we can observe the almost complete dominance of the facies quartziferous sand, with
prevalence of the variable Sand as the main descriptor property, followed by Quartz, in the samples in the lower part of Figure 2. Those are very little affected by the reworking process. On the other hand, the facies quartziferous bioclastic mud is represented in the upper area of Figure 2. Lastly, the position of the variables PH, GR and FE so near the origin indicates the little influence they have in the definition of the sedimentary facies. Figure 3 represents the results obtained from the correspondence analysis. These are globally the same, although they can be refined further. It is worth noticing in the first place that in this graph the axes and the coordinates of the sediments and the variables do not indicate the same thing as in the biplot representation: they represent the plot in two dimensions of the deviations of the rows (sediments) and the columns (variables) from the independence model between both. In Figure 3 there can be appreciated, in the lower left area, the proximity of most of the samples to the variables Sand and Quartz, closely related, which define the predominant facies, and the slipping toward the left of the remaining samples toward the proximity of the variable Mud, which defines the second facies. The opposition of SA and QU against MU is what defines the first axis and, therefore, where the differences among the sediments along it lie. But also, the proximity of some of them to the variables PH, CA or FE indicates the relative importance of these variables in their composition. So we think that the diversity among them is better explained than in Figures 1 and 2. This diversity is present also in the second axis, where all the variables except GR have a similar influence in its definition.
The contrast of all of them with the variable GR, a component scarcely present in the sediments, is what marks the differences observed on this second axis.

4 Conclusions

Three models of multivariate analysis have been used to analyse the facies of the same sample of sediments taken from the sea floor of the Cadiz bay. Basically, the results obtained with them are the same and in consonance with that exposed in Gutierrez-Mas [4], although complementary conclusions can be extracted.
These methods have allowed us to define the sedimentary facies present in the sea floors of the Cadiz bay, and to differentiate the processes that have given rise to the sediments. The Q-factor model and the biplot obtain the same results, because both techniques are based on the SVD of the same data matrix; therefore, to interpret the factor loadings and scores, the biplot can be used, since it gives a very clear graphic idea in the definition of the facies by treating sediments and variables jointly. On the other hand, the correspondence analysis also provides a combined plot, where the two-dimensional fit that is made is of an independence model between the rows (sediments) and columns (variables) of the data matrix. In this plot, besides the predominant associations of variables that define the facies, the importance that the other, less predominant, variables have in each one of the sediments can be analysed.

References

[1] Eckart, C. and Young, G., The approximation of one matrix by another of lower rank, Psychometrika, 1, pp. 211-218, 1936.
[2] Gabriel, K.R., The biplot graphic display of matrices with application to principal components analysis, Biometrika, 58, pp. 453-467, 1971.
[3] Greenacre, M.J., Theory and Applications of Correspondence Analysis, Academic Press, London, 1984.
[4] Gutierrez-Mas, J.M., Thesis, Univ. of Cadiz, 364 pp.
[5] Householder, A.S. and Young, G., Matrix approximation and latent roots, Am. Math. Monthly, 45, pp. 165-171, 1938.
[6] Imbrie, J. and Purdy, E., Classification of modern Bahamian carbonate sediments, Mem. Amer. Assoc. Petrol. Geol., 1, pp. 253-272, 1962.
[7] Reyment, R.A. and Jöreskog, K.G., Applied Factor Analysis in the Natural Sciences, Second edition, Cambridge University Press, 1993.
More informationPrincipal Component Analysis
Principal Component Analysis Giorgos Korfiatis Alfa-Informatica University of Groningen Seminar in Statistics and Methodology, 2007 What Is PCA? Dimensionality reduction technique Aim: Extract relevant
More informationCorrespondence Analysis & Related Methods
Correspondence Analysis & Related Methods Michael Greenacre SESSION 9: CA applied to rankings, preferences & paired comparisons Correspondence analysis (CA) can also be applied to other types of data:
More informationFundamentals of Engineering Analysis (650163)
Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is
More informationGEOL Lab 9 (Carbonate Sedimentary Rocks in Hand Sample and Thin Section)
GEOL 333 - Lab 9 (Carbonate Sedimentary Rocks in Hand Sample and Thin Section) Sedimentary Rock Classification - As we learned last week, sedimentary rock, which forms by accumulation and lithification
More information7. Symmetric Matrices and Quadratic Forms
Linear Algebra 7. Symmetric Matrices and Quadratic Forms CSIE NCU 1 7. Symmetric Matrices and Quadratic Forms 7.1 Diagonalization of symmetric matrices 2 7.2 Quadratic forms.. 9 7.4 The singular value
More informationBASIC NOTIONS. x + y = 1 3, 3x 5y + z = A + 3B,C + 2D, DC are not defined. A + C =
CHAPTER I BASIC NOTIONS (a) 8666 and 8833 (b) a =6,a =4 will work in the first case, but there are no possible such weightings to produce the second case, since Student and Student 3 have to end up with
More informationSingular Value Decomposition
Singular Value Decomposition CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Singular Value Decomposition 1 / 35 Understanding
More informationMATH 3321 Sample Questions for Exam 3. 3y y, C = Perform the indicated operations, if possible: (a) AC (b) AB (c) B + AC (d) CBA
MATH 33 Sample Questions for Exam 3. Find x and y so that x 4 3 5x 3y + y = 5 5. x = 3/7, y = 49/7. Let A = 3 4, B = 3 5, C = 3 Perform the indicated operations, if possible: a AC b AB c B + AC d CBA AB
More informationClusters. Unsupervised Learning. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved
Clusters Unsupervised Learning Luc Anselin http://spatial.uchicago.edu 1 curse of dimensionality principal components multidimensional scaling classical clustering methods 2 Curse of Dimensionality 3 Curse
More informationRoundoff Error. Monday, August 29, 11
Roundoff Error A round-off error (rounding error), is the difference between the calculated approximation of a number and its exact mathematical value. Numerical analysis specifically tries to estimate
More informationMatrix Factorizations
1 Stat 540, Matrix Factorizations Matrix Factorizations LU Factorization Definition... Given a square k k matrix S, the LU factorization (or decomposition) represents S as the product of two triangular
More informationLECTURE NOTE #10 PROF. ALAN YUILLE
LECTURE NOTE #10 PROF. ALAN YUILLE 1. Principle Component Analysis (PCA) One way to deal with the curse of dimensionality is to project data down onto a space of low dimensions, see figure (1). Figure
More information.. CSC 566 Advanced Data Mining Alexander Dekhtyar..
.. CSC 566 Advanced Data Mining Alexander Dekhtyar.. Information Retrieval Latent Semantic Indexing Preliminaries Vector Space Representation of Documents: TF-IDF Documents. A single text document is a
More informationUNIT 4 SEDIMENTARY ROCKS
UNIT 4 SEDIMENTARY ROCKS WHAT ARE SEDIMENTS Sediments are loose Earth materials (unconsolidated materials) such as sand which are transported by the action of water, wind, glacial ice and gravity. These
More informationA Multivariate Perspective
A Multivariate Perspective on the Analysis of Categorical Data Rebecca Zwick Educational Testing Service Ellijot M. Cramer University of North Carolina at Chapel Hill Psychological research often involves
More informationFactor Analysis of Data Matrices
Factor Analysis of Data Matrices PAUL HORST University of Washington HOLT, RINEHART AND WINSTON, INC. New York Chicago San Francisco Toronto London Contents Preface PART I. Introductory Background 1. The
More informationInformation Retrieval
Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices
More informationLAB 2 IDENTIFYING MATERIALS FOR MAKING SOILS: ROCK AND PARENT MATERIALS
LAB 2 IDENTIFYING MATERIALS FOR MAKING SOILS: ROCK AND PARENT MATERIALS Learning outcomes The student is able to: 1. understand and identify rocks 2. understand and identify parent materials 3. recognize
More informationNotes on singular value decomposition for Math 54. Recall that if A is a symmetric n n matrix, then A has real eigenvalues A = P DP 1 A = P DP T.
Notes on singular value decomposition for Math 54 Recall that if A is a symmetric n n matrix, then A has real eigenvalues λ 1,, λ n (possibly repeated), and R n has an orthonormal basis v 1,, v n, where
More informationLecture 5 Singular value decomposition
Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn
More informationMATH 312 Section 8.3: Non-homogeneous Systems
MATH 32 Section 8.3: Non-homogeneous Systems Prof. Jonathan Duncan Walla Walla College Spring Quarter, 2007 Outline Undetermined Coefficients 2 Variation of Parameter 3 Conclusions Undetermined Coefficients
More informationSingular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces
Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang
More informationCorrespondence Analysis of Longitudinal Data
Correspondence Analysis of Longitudinal Data Mark de Rooij* LEIDEN UNIVERSITY, LEIDEN, NETHERLANDS Peter van der G. M. Heijden UTRECHT UNIVERSITY, UTRECHT, NETHERLANDS *Corresponding author (rooijm@fsw.leidenuniv.nl)
More informationLinear Methods in Data Mining
Why Methods? linear methods are well understood, simple and elegant; algorithms based on linear methods are widespread: data mining, computer vision, graphics, pattern recognition; excellent general software
More informationStatistics 202: Data Mining. c Jonathan Taylor. Week 2 Based in part on slides from textbook, slides of Susan Holmes. October 3, / 1
Week 2 Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Part I Other datatypes, preprocessing 2 / 1 Other datatypes Document data You might start with a collection of
More informationDimension Reduction Techniques. Presented by Jie (Jerry) Yu
Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage
More informationPart I. Other datatypes, preprocessing. Other datatypes. Other datatypes. Week 2 Based in part on slides from textbook, slides of Susan Holmes
Week 2 Based in part on slides from textbook, slides of Susan Holmes Part I Other datatypes, preprocessing October 3, 2012 1 / 1 2 / 1 Other datatypes Other datatypes Document data You might start with
More information(a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? Solution: dim N(A) 1, since rank(a) 3. Ax =
. (5 points) (a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? dim N(A), since rank(a) 3. (b) If we also know that Ax = has no solution, what do we know about the rank of A? C(A)
More informationMoore Penrose inverses and commuting elements of C -algebras
Moore Penrose inverses and commuting elements of C -algebras Julio Benítez Abstract Let a be an element of a C -algebra A satisfying aa = a a, where a is the Moore Penrose inverse of a and let b A. We
More informationLinear Algebra (Review) Volker Tresp 2018
Linear Algebra (Review) Volker Tresp 2018 1 Vectors k, M, N are scalars A one-dimensional array c is a column vector. Thus in two dimensions, ( ) c1 c = c 2 c i is the i-th component of c c T = (c 1, c
More informationProblem # Max points possible Actual score Total 120
FINAL EXAMINATION - MATH 2121, FALL 2017. Name: ID#: Email: Lecture & Tutorial: Problem # Max points possible Actual score 1 15 2 15 3 10 4 15 5 15 6 15 7 10 8 10 9 15 Total 120 You have 180 minutes to
More informationDeep Learning Book Notes Chapter 2: Linear Algebra
Deep Learning Book Notes Chapter 2: Linear Algebra Compiled By: Abhinaba Bala, Dakshit Agrawal, Mohit Jain Section 2.1: Scalars, Vectors, Matrices and Tensors Scalar Single Number Lowercase names in italic
More informationIntroduction to multivariate analysis Outline
Introduction to multivariate analysis Outline Why do a multivariate analysis Ordination, classification, model fitting Principal component analysis Discriminant analysis, quickly Species presence/absence
More informationGY 402: Sedimentary Petrology
UNIVERSITY OF SOUTH ALABAMA GY 402: Sedimentary Petrology Lecture 13: Immature Siliciclastic Sedimentary Environments Alluvial Fans, Braided Streams Instructor: Dr. Douglas W. Haywick Last Time Immature
More informationThe SVD-Fundamental Theorem of Linear Algebra
Nonlinear Analysis: Modelling and Control, 2006, Vol. 11, No. 2, 123 136 The SVD-Fundamental Theorem of Linear Algebra A. G. Akritas 1, G. I. Malaschonok 2, P. S. Vigklas 1 1 Department of Computer and
More informationMath Camp Notes: Linear Algebra II
Math Camp Notes: Linear Algebra II Eigenvalues Let A be a square matrix. An eigenvalue is a number λ which when subtracted from the diagonal elements of the matrix A creates a singular matrix. In other
More informationLecture Outline Wednesday - Friday February 14-16, 2018
Lecture Outline Wednesday - Friday February 14-16, 2018 Quiz 2 scheduled for Friday Feb 23 (Interlude B, Chapters 6,7) Questions? Chapter 6 Pages of the Past: Sedimentary Rocks Key Points for today Be
More informationFinal Exam, Linear Algebra, Fall, 2003, W. Stephen Wilson
Final Exam, Linear Algebra, Fall, 2003, W. Stephen Wilson Name: TA Name and section: NO CALCULATORS, SHOW ALL WORK, NO OTHER PAPERS ON DESK. There is very little actual work to be done on this exam if
More informationPrincipal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17
Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 17 Outline Filters and Rotations Generating co-varying random fields Translating co-varying fields into
More informationFocus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.
Previously Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations y = Ax Or A simply represents data Notion of eigenvectors,
More informationChapter 11 Canonical analysis
Chapter 11 Canonical analysis 11.0 Principles of canonical analysis Canonical analysis is the simultaneous analysis of two, or possibly several data tables. Canonical analyses allow ecologists to perform
More informationLecture 5: Ecological distance metrics; Principal Coordinates Analysis. Univariate testing vs. community analysis
Lecture 5: Ecological distance metrics; Principal Coordinates Analysis Univariate testing vs. community analysis Univariate testing deals with hypotheses concerning individual taxa Is this taxon differentially
More informationLinear Algebra Review
January 29, 2013 Table of contents Metrics Metric Given a space X, then d : X X R + 0 and z in X if: d(x, y) = 0 is equivalent to x = y d(x, y) = d(y, x) d(x, y) d(x, z) + d(z, y) is a metric is for all
More information