Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17


1 Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 17

2 Outline
- Filters and Rotations
- Generating co-varying random fields
- Translating co-varying fields into independent data sets
- Principal Components and Empirical Orthogonal Functions
- Singular Value Decomposition
- The rank order of a data set: how many degrees of freedom are there really?
- Examples - PCA

3 Class Calendar
Classes:
- Tuesday May 24th: Principal Components I
- Thursday May 26th: Principal Components II
- Lab: Principal Components
- Tuesday May 31st: Simulating Structured Data
- Thursday: Empirical Orthogonal Distance Functions
- No Final

4 Revisit to Block Kriging-I

5 Revisit to Block Kriging-II
Figure legend: SPI - 1 sd, Avg SPI, SPI + 1 sd; Lower 95% Bound, Mean SPI, Upper 95% Bound

6 Backwards and Forwards Decompositions
Diagram: many correlated data <-> uncorrelated components, via de/composition of the covariance matrix.
In general, given a set of correlated (covarying) data of rank n, the covariance matrix can be used to translate this data into a set of n independent vectors. While bearing many names, these independent vectors are often called principal components.
Because these components are uncorrelated:
- They can be easily used in statistical estimation, significance testing, etc.
- The components can compactly describe multivariate datasets.
- Sometimes (but not always) the components correspond to real things in the real world.
We can also go the other way: given a set of independent (uncorrelated) vectors, we can impose a specific correlation structure, creating a set of n dependent vectors.

7 Simulating co-varying data I
Types of covarying data:
- Different process types with interacting effects, e.g. weight and height
- Similar process types with spatial or temporal persistence:
  - Auto-regressive or moving-average temporal processes (correlated random fields in time)
  - Spatially co-varying processes (correlated random fields in space)
One way to simulate covariance is by a Cholesky decomposition of the covariance matrix C. The Cholesky factor U is like the square root of a matrix: C = L U = U^T U, where
L = [[a, 0, 0], [x, b, 0], [y, z, c]] and U = L^T = [[a, x, y], [0, b, z], [0, 0, c]],
so that L U = C = [[σ_1², σ_1,2, σ_1,3], [σ_1,2, σ_2², σ_2,3], [σ_1,3, σ_2,3, σ_3²]]
(A minimal numerical sketch follows below.)
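A minimal numpy sketch of the Cholesky "square root" idea. The 3 x 3 covariance values below are made-up illustrative numbers, not from the lecture:

import numpy as np

# Hypothetical 3x3 covariance matrix (illustrative values, not the lecture's data)
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

# numpy returns the lower-triangular factor L with C = L @ L.T
L = np.linalg.cholesky(C)
U = L.T                              # upper-triangular factor, C = U.T @ U

print(np.allclose(L @ L.T, C))       # True: the factor reproduces C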

8 Simulating co-varying data II
Given a matrix Z of uncorrelated vectors (rows (d_i, e_i, f_i)), we can transform them into a new dataset X with covariance C by multiplying Z by U:
Z = [[d_1, e_1, f_1], [d_2, e_2, f_2], ..., [d_n, e_n, f_n]]
X = Z U = [[d_1 a, d_1 x + e_1 b, d_1 y + e_1 z + f_1 c], [d_2 a, d_2 x + e_2 b, d_2 y + e_2 z + f_2 c], ..., [d_n a, d_n x + e_n b, d_n y + e_n z + f_n c]]
Example spreadsheet: (linked on the original slide)
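A minimal sketch of this forward simulation, continuing the numpy example above (the target covariance is the same assumed, illustrative matrix):

import numpy as np

rng = np.random.default_rng(0)

# Illustrative target covariance (assumed values, as in the sketch above)
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
U = np.linalg.cholesky(C).T          # upper-triangular factor, C = U.T @ U

# Z: n rows of uncorrelated, unit-variance random numbers
n = 100_000
Z = rng.standard_normal((n, 3))

# X = Z U has (approximately) covariance C
X = Z @ U
print(np.cov(X, rowvar=False).round(2))   # close to C for large n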

9 Potential Application: Simulating time series
We can model many time series by assuming they are a combination of lagged information plus a random shock z:
- 1st-order auto-regressive function (AR1): y_t = b_1 y_(t-1) + z
- 2nd-order auto-regressive function (AR2): y_t = b_1 y_(t-1) + b_2 y_(t-2) + z
- 3rd-order auto-regressive function (AR3): y_t = b_1 y_(t-1) + b_2 y_(t-2) + b_3 y_(t-3) + z
The terms b_1, b_2, ..., b_n specify the temporal autocorrelation.
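A quick sketch of simulating an AR(1) series of this form (the coefficient b1 = 0.7 and series length are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(1)

b1, n = 0.7, 500                     # illustrative AR(1) coefficient and length
y = np.zeros(n)
z = rng.standard_normal(n)           # white-noise shocks

# y_t = b1 * y_{t-1} + z_t
for t in range(1, n):
    y[t] = b1 * y[t - 1] + z[t]

# the lag-1 autocorrelation should be close to b1
print(np.corrcoef(y[:-1], y[1:])[0, 1].round(2))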

10 Principal Component Analysis
Also referred to as empirical orthogonal functions and factor analysis.
Translates a multivariate set of data into a set of uncorrelated principal components, such that the 1st (and each successive) component explains the most variance possible.
X = (PC scores) x (PC loadings)^T, where
X = [[d_1, e_1, f_1], [d_2, e_2, f_2], ..., [d_n, e_n, f_n]]
PC scores = [[PC1_1, PC2_1, PC3_1], [PC1_2, PC2_2, PC3_2], ..., [PC1_n, PC2_n, PC3_n]]
PC loadings = [[w_d,PC1, w_d,PC2, w_d,PC3], [w_e,PC1, w_e,PC2, w_e,PC3], [w_f,PC1, w_f,PC2, w_f,PC3]]
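A small numpy sketch of this decomposition; randomly generated correlated data stand in for the lecture's dataset:

import numpy as np

rng = np.random.default_rng(2)

# Stand-in data: 200 observations of 3 correlated variables
Z = rng.standard_normal((200, 3))
X = Z @ np.array([[2.0, 0.5, 0.2],
                  [0.0, 1.0, 0.4],
                  [0.0, 0.0, 0.7]])
Xc = X - X.mean(axis=0)                       # center each column

# Eigendecomposition of the covariance matrix gives the loadings
C = np.cov(Xc, rowvar=False)
eigvals, loadings = np.linalg.eigh(C)         # columns are eigenvectors
order = np.argsort(eigvals)[::-1]             # sort by variance explained
eigvals, loadings = eigvals[order], loadings[:, order]

scores = Xc @ loadings                        # PC scores
print(np.allclose(scores @ loadings.T, Xc))   # True: scores x loadings^T rebuilds X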

11 Formal Definition
Let X be a centered multivariate dataset with m variables (columns) and n observations (rows).
Let ε_1 be an m-length vector defining the pattern that explains the most variance in X; ε_1 essentially describes a set of slopes along each variable in m-space.
Constrain the norm of ε_1 to unity: ||ε_1|| = (Σ_i ε_1i²)^0.5 = (ε_1^T ε_1)^0.5 = 1
Now find the ε_1 that explains the most variance by minimizing the residual: Residual = E(||X - X ε_1||²)
(A more explicit statement of this optimization is sketched below.)
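In more explicit notation (a standard formulation written in my own symbols, not copied from the slide), the first component solves a constrained optimization whose solution is the leading eigenvector of the covariance matrix C:

% First principal component as a constrained optimization (standard result;
% C denotes the sample covariance matrix of the centered data X).
\max_{\varepsilon_1}\; \varepsilon_1^{\mathsf T} C \, \varepsilon_1
\quad\text{subject to}\quad \varepsilon_1^{\mathsf T}\varepsilon_1 = 1

% Equivalently, minimize the reconstruction residual:
\min_{\varepsilon_1}\; \bigl\lVert X - X \varepsilon_1 \varepsilon_1^{\mathsf T} \bigr\rVert_F^{2}
\quad\text{subject to}\quad \varepsilon_1^{\mathsf T}\varepsilon_1 = 1

% Setting the Lagrangian gradient to zero gives C \varepsilon_1 = \lambda \varepsilon_1,
% so \varepsilon_1 is the eigenvector of C with the largest eigenvalue (next slides).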

12 Bases of linear analysis
A collection of vectors {ε_i} forms a linear basis for an m-dimensional vector space V if for any vector a in V there exists a set of coefficients α_i, i = 1, ..., m, such that a = Σ_i α_i ε_i
An orthogonal basis is a linear basis consisting of vectors ε_i that are mutually orthogonal, i.e. ε_i^T ε_j = 0 for i ≠ j
The set of vectors is orthonormal if, in addition, ||ε_i|| = (Σ_k ε_ik²)^0.5 = (ε_i^T ε_i)^0.5 = 1 for all i = 1, ..., m

13 Eigenvalues and Eigenvectors-I
The PCA transform is derived by decomposing the covariance matrix into its eigenvectors and eigenvalues.
Eigenvalues and eigenvectors: if C is an m by m matrix, then the number λ is said to be an eigenvalue of C if Cε = λε; the vector ε is said to be an eigenvector of C. ε uniquely determines a direction in the m-dimensional data space.
We can use eigenvalues and eigenvectors to solve for our PCA components:
- Let E be a matrix whose columns are the eigenvectors [ε_1, ε_2, ..., ε_m]
- Let Λ be a diagonal matrix with the eigenvalues along the diagonal, Λ_i,i = λ_i
- Then C = E Λ E^T
- Each eigenvalue λ_i is proportional to the total variance explained by the i-th component
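A minimal numpy sketch of this eigendecomposition, reusing the illustrative covariance matrix from the earlier sketches (assumed values, not the lecture's data):

import numpy as np

# Illustrative covariance matrix (assumed values)
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

# eigh is appropriate for symmetric matrices; eigenvalues come back ascending
eigvals, E = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]            # largest-variance component first
eigvals, E = eigvals[order], E[:, order]

print(np.allclose(E @ np.diag(eigvals) @ E.T, C))   # True: C = E Lambda E^T
print((eigvals / eigvals.sum()).round(3))           # fraction of variance per component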

14 Eigenvalues and Eigenvectors-II
The eigenvector matrix is orthonormal: E^T E = E E^T = I, where I is the identity matrix.
(As on the previous slides: the ε_i are mutually orthogonal, ε_i^T ε_j = 0 for i ≠ j, and each has unit length, (ε_i^T ε_i)^0.5 = 1, for i = 1, ..., m.)
Since the off-diagonal components are zero, the resulting PC time series will be uncorrelated.
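A quick numerical check of both properties on synthetic data drawn from the same assumed covariance used above:

import numpy as np

rng = np.random.default_rng(3)

# Simulate correlated data with an assumed covariance, then decompose it
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
X = rng.multivariate_normal(np.zeros(3), C, size=50_000)

eigvals, E = np.linalg.eigh(np.cov(X, rowvar=False))

print(np.allclose(E.T @ E, np.eye(3)))            # True: E^T E = I
scores = (X - X.mean(axis=0)) @ E                 # PC time series
corr = np.corrcoef(scores, rowvar=False)
print(np.abs(corr - np.eye(3)).max() < 0.02)      # off-diagonals near zero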

15 PC Example: Correlated data - I
Covariance matrix of Data1, Data2, Data3, and the importance of components (standard deviation, proportion of variance, cumulative proportion) for Comp.1, Comp.2, Comp.3. (Numeric table entries did not survive transcription.)
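A sketch of how such an importance-of-components summary can be computed with numpy; random correlated data stand in for the slide's Data1-Data3:

import numpy as np

rng = np.random.default_rng(4)

# Stand-in for Data1..Data3: three correlated series (assumed covariance)
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
X = rng.multivariate_normal(np.zeros(3), C, size=1000)

eigvals = np.sort(np.linalg.eigh(np.cov(X, rowvar=False))[0])[::-1]

sd = np.sqrt(eigvals)                  # "Standard deviation" row
prop = eigvals / eigvals.sum()         # "Proportion of Variance" row
cum = np.cumsum(prop)                  # "Cumulative Proportion" row
for name, row in [("Std dev", sd), ("Prop var", prop), ("Cum prop", cum)]:
    print(name, row.round(3))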

16 PC Example: Correlated data - II
Loadings of Data1, Data2, Data3 on Comp.1, Comp.2, Comp.3, and a second Comp.1/Comp.2/Comp.3 table printed with R row labels [1,], [2,], [3,]. (Numeric table entries did not survive transcription.)

17 Example: Landsat PCs
Image panels for the Blue, Green, Red, Near IR, and SW IR bands (band wavelengths in µm were listed on the slide).

18 Band 1, 3, 5 as RGB

19 Correlation Matrix & Eigenvectors
Correlation matrix among the Blue, Green, Red, NIR, and SW IR bands, and the eigenvector loadings of each band on the Brightness and Greenness components. (Numeric table entries did not survive transcription.)
PC1 = Brightness = R + G + B + SW IR + NIR
PC2 = Greenness = R + G + B - SW IR - NIR
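The sign structure of these two components can be illustrated with a small simulation; the synthetic band data below are assumptions, not the Landsat numbers from the slide:

import numpy as np

rng = np.random.default_rng(5)

# Hypothetical 5-band data (Blue, Green, Red, NIR, SW IR): a shared "brightness"
# signal plus a visible-vs-infrared contrast, plus noise. Illustrative only.
n = 5000
brightness = rng.standard_normal(n)
contrast = rng.standard_normal(n)            # + for visible bands, - for IR bands
noise = rng.standard_normal((n, 5)) * 0.3
signs = np.array([1, 1, 1, -1, -1])          # visible vs. NIR / SW IR
bands = brightness[:, None] + 0.5 * contrast[:, None] * signs + noise

R = np.corrcoef(bands, rowvar=False)         # 5 x 5 band correlation matrix
eigvals, E = np.linalg.eigh(R)
E = E[:, np.argsort(eigvals)[::-1]]

print(E[:, 0].round(2))   # PC1: all weights share one sign ("brightness")
print(E[:, 1].round(2))   # PC2: visible bands oppose the IR bands ("greenness"-like)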

20 PC1 & PC2
PC1 = Brightness = R + G + B + SW IR + NIR
PC2 = Greenness = R + G + B - SW IR - NIR
In the composite image, PC1 is displayed in the red and green channels and PC2 in the blue channel.
