Principal Component Analysis (PCA) or Empirical Orthogonal Function (EOF) (Lorenz, 1956)


1 Principal Component Analysis (PCA) or Empirical Orthogonal Function (EOF) (Lorenz, 1956). Hotelling, H., 1935: The most predictable criterion. J. Educ. Psychol., 26, 139-142. (from Jackson, 1991 and Graham, 1996 (class notes)) Jackson, J. E., 1991: A user's guide to principal components. Wiley & Sons, Inc., 569 pp. Objective: Reduce a data set containing a large number of variables to a data set containing far fewer variables, but that still represents a large fraction of the variability contained in the original data set. For a field: What are the important patterns that appear in this field, and how much of the variance do they explain?

2 Given a set of data X, we want to find a linear transformation of X (say U) that has the maximum variance among all such linear transformations: Var(XU) = max. (Numerical example: two variables x1 and x2 with correlation r(x1, x2) = 0.89, collected as x = (x1, x2) with its 2 x 2 sample covariance matrix S.)

3 If X is the data set and U the transformation, the transformed data will be Z = XU, and (in matrix notation)

var(z) = Z'Z/(n-1) = (XU)'(XU)/(n-1) = U'X'XU/(n-1)

but X'X/(n-1) is the variance-covariance matrix S, so that var(z) = U'SU, and we want it to be maximal. To solve the maximization problem: 1- constrain U to have unit length, so that U'U = 1; 2- define a Lagrange multiplier λ to bring the side constraint into the problem (a Lagrange multiplier is used to find the extrema of f(x) subject to the constraint g(x) = c), so that the problem becomes

var(z) = f(U) = U'SU - λ(U'U - 1) = max

Setting df/dU = 0 gives *** (S - λI) U = 0 ***

4 Eigenstructure of a square matrix A. For a square matrix A, there is a vector E such that AE = λE, where λ is a scalar. E contains the so-called eigenvectors and λ the eigenvalues. How to calculate λ and E?

AE = λE  →  AE - λE = 0  →  AIE - λIE = 0  →  (A - λI)E = 0, with E ≠ 0.

- Solving for λ: if (A - λI) had an inverse (i.e., it were non-singular, with det(A - λI) ≠ 0), then (A - λI)^(-1) (A - λI) E = 0 would imply E = 0 (no good). Therefore (A - λI) must be singular, i.e., det(A - λI) = 0, and λ can be determined from this equation (λ has as many values as the dimension of A).
- Solving for E: substitute each value of λ into (A - λI)E = 0 and find the E corresponding to each of them.

We can thus determine the eigenstructure of a square matrix from the so-called characteristic equation (A - λI)E = 0.
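As a quick numerical illustration (a minimal sketch; the matrix A below is an arbitrary example, not taken from the lecture), MATLAB's built-in eig solves the characteristic equation directly:

A = [2 1; 1 2];        % an arbitrary symmetric 2x2 matrix
[E, L] = eig(A);       % columns of E = eigenvectors, diagonal of L = eigenvalues
diag(L)                % the eigenvalues: 1 and 3 for this A
A*E - E*L              % numerically zero, confirming AE = lambda*E column by column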

5 Comparing the characteristic equation with the equation from the maximization problem, we can see that they have the same form: (S - λI)U = 0 and (A - λI)E = 0. This means that the variance-maximizing transformation of S is given by its eigenvectors, and the eigenmode with the largest eigenvalue gives the transformation with the largest variance. Because U'U = 1, the variance represented by each eigenmode is equal to its eigenvalue, and Σ_{i=1}^{n} λ_i = total variance. Graphically, the eigenvectors give the axes of deformation and the eigenvalues their magnitudes.

6 Applying (S - λI)U = 0 to our data set: 1- Solving for λ: set det(S - λI) = 0 and solve for the two roots λ1 and λ2. 2- Solving for U: substitute each root back into (S - λI)e = 0 to obtain the eigenvectors e1 and e2, which form the columns of E. The eigenvectors are often normalized so that they can be compared, which is easy: en_i = e_i / sqrt(e_i' e_i). Each column of E is called a principal component or mode.
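In MATLAB, the two steps above collapse into a single call (a sketch; the entries of S are illustrative, built from the correlation 0.89 of the example on slide 2):

S = [1.00 0.89; 0.89 1.00];   % covariance matrix of the standardized pair (x1, x2)
[E, L] = eig(S);              % solves det(S - lambda*I) = 0 and (S - lambda*I)e = 0 at once
lambda = diag(L)              % eigenvalues 0.11 and 1.89; they sum to trace(S), the total variance
E' * E                        % the identity: eig already returns unit-length (normalized) eigenvectors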

7 Important property!! E is orthonormal: e1'e1 = 1; e2'e2 = 1; e1'e2 = 0; in general, e_i'e_j = 1 for i = j and 0 for i ≠ j. The new variables u_m, which account successively for the maximum amount of variance, are calculated by projecting the old variables (x) onto the new axes (eigenvectors): u = E'x. u1 = e1'x is the linear combination of the elements of x with the greatest variance; u2 = e2'x is the linear combination with the greatest variance that is uncorrelated with u1, and so on. PCA is worth applying to a data set when there are substantial correlations among the variables contained in the original data set (i.e., redundant information). The elements of the new vectors are called the principal components and are uncorrelated (orthogonal).
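These properties are easy to verify numerically (a minimal sketch with synthetic data; all variable names here are illustrative):

rng(1);                                          % reproducible synthetic data
X = randn(200,3) * [1 0.9 0; 0 0.4 0; 0 0 1];    % 200 times x 3 correlated variables
Xa = X - mean(X);                                % center the data (anomalies)
[E, L] = eig(cov(Xa));                           % E is orthonormal: E'*E is the identity
U = Xa * E;                                      % project the old variables onto the new axes
corrcoef(U)                                      % off-diagonal terms ~0: the new variables are uncorrelated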

8 In practice: given a data set X(nt, nx) with dimensions time x variable: 1- Standardize or normalize your data. 2- Subtract the mean of each column; for better physical interpretation, PCA is applied only to centered data (anomalies). 3- Calculate the variance-covariance matrix S of X. 4- Obtain the eigenvalues and eigenvectors of S using whatever means you like. 5- Plot your eigenvectors (map or graph); each mode explains λ_i / Σ_j λ_j x 100% of the total variance. 6- Analyze your results.

9 Given a set of data represented as a matrix X in which each column represents a variable:

X = [ x_{1,1} x_{1,2} x_{1,3} ... x_{1,m}
      x_{2,1} x_{2,2} x_{2,3} ... x_{2,m}
      x_{3,1} x_{3,2} x_{3,3} ... x_{3,m}
      ...
      x_{n,1} x_{n,2} x_{n,3} ... x_{n,m} ]

1- PCA can only be conducted on centered data (anomalies), i.e., the average value of each column must be zero. 2- PCA (EOF) can be applied to either the covariance or the correlation matrix, but the covariance matrix can be used only when all variables have the same units (variance carries units). 3- One can always use the covariance matrix if the data are normalized first. 4- The correlation matrix can always be used (correlation is unitless).
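Point 3 can be checked in a few lines (a sketch with synthetic stand-in data):

X = randn(100,4) * randn(4);           % synthetic data with correlated columns
Z = (X - mean(X)) ./ std(X);           % normalize: zero mean and unit variance per column
max(max(abs(cov(Z) - corrcoef(X))))    % ~0: covariance of normalized data = correlation matrix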

10 Analysis of the PC / EOF eigenvectors. New time series: projection of the original data onto the new axes, u = E'x. u1 = e1'x is the linear combination of the elements of x with the greatest variance; u2 = e2'x is the linear combination with the greatest variance that is uncorrelated with u1, and so on.

11 Plot the new time series and analyze them. E.g.: Fraedrich (1993), mean zonal wind over Singapore.

12 Maps: - eigenvectors; - local variance (the squared correlation between the time series and the original data). If you have station data (Uvo, 2003; n stations, t times): interpolate the data to grid points (griddata in Matlab) and plot the line contours.


15 In case your data are given as grid points on maps, you have to rearrange them so that you can calculate the EOF: each map (times t1, t2, ..., tn) becomes one row of the data matrix. Follow steps 1 to 6 on page 8, then return the eigenvectors to map format to plot them. Important: before applying PCA, - replace missing data (missing data should not be more than 10% of the total); - exclude land grid points (in the case of oceanic data) or ocean grid points (in the case of land data).
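A minimal sketch of that rearrangement (all names and the random stand-in field are illustrative; for symmetric input, eig returns the eigenvalues in ascending order, so the last column is the leading mode):

nlat = 10; nlon = 12; nt = 50;
grid3d = randn(nlat, nlon, nt);          % stand-in gridded field (lat x lon x time)
grid3d(1:3, 1:4, :) = NaN;               % pretend these grid points are land
Z = reshape(grid3d, nlat*nlon, nt)';     % data matrix: rows = times, columns = grid points
keep = ~any(isnan(Z), 1);                % exclude the always-missing (land) points
Zsea = Z(:, keep);                       % apply steps 1 to 6 of page 8 to this matrix
[E, L] = eig(cov(Zsea - mean(Zsea)));    % eigenvectors of the anomaly covariance matrix
map = nan(nlat*nlon, 1);                 % return mode 1 to map format for plotting
map(keep) = E(:, end);                   % leading eigenvector back onto the kept points
map = reshape(map, nlat, nlon);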

16 Gurgel and Ferreira (2003): mean vegetation; 0.12% summer/winter; 0.04% spring/autumn; 0.03% semi-annual cycle.

17 Gurgel and Ferreira (2003)

18 Greatbatch (2000): Arctic Oscillation

19 Extended EOF (EEOF). The ordinary EOF provides a snapshot of the spatially evolving behaviour of your data. Suppose you are studying a phenomenon that varies in both time and space; how do you capture its evolution? Starting from the original data matrix (one row per time 1, 2, 3, ..., t, with columns y1, y2, y3, ..., yn holding each map), build a new matrix whose rows contain several consecutive maps:

Time 1: map1 map2 map3 map4
Time 2: map2 map3 map4 map5
Time 3: map3 map4 map5 map6
Time 4: map4 map5 map6 map7
...
Time t-3: map(t-3) map(t-2) map(t-1) map(t)

Look at this simply as a new data matrix with dimension (t-3) x (n*4) and apply steps 1 to 6 (page 8). The ordering (lag spacing) can be done however seems useful, e.g.:

Time 1: map1 map3 map5 map7
Time 2: map2 map4 map6 map8
Time 3: map3 map5 map7 map9
Time 4: map4 map6 map8 map10
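A sketch of the lagged-matrix construction in MATLAB (names are illustrative; Z stands for the original t x n data matrix):

Z = randn(100, 20);                   % stand-in data: t = 100 times, n = 20 values per map
nlag = 4;                             % number of consecutive maps per row
[t, n] = size(Z);
Zext = zeros(t-nlag+1, n*nlag);       % the extended data matrix, (t-3) x (n*4)
for k = 1:nlag
    Zext(:, (k-1)*n+1 : k*n) = Z(k : t-nlag+k, :);   % block k holds the maps lagged by k-1
end
% Zext is now an ordinary data matrix: apply steps 1 to 6 of page 8 to it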

20 For seasonally varying systems, it is common to use rows of the form:

Time 1: Jan Feb Mar Apr (year 1)
Time 2: Jan Feb Mar Apr (year 2)
Time 3: Jan Feb Mar Apr (year 3)
...
Time t: Jan Feb Mar Apr (year t)

EEOFs are especially suitable for detecting patterns of systematic evolution, such as propagation.

21 Example: Fraedrich

22 Rotated EOF. Used when the physical interpretation of the principal components (eigenvectors) is important. Mathematically, this process relaxes the orthogonality constraint on the principal components (eigenvectors). Physically, it means that the second and further eigenvectors are chosen as physical representations of the signal present in the data set, and not merely as orthogonal to the previous eigenvectors. Used mainly when the variance is distributed among several eigenvectors, i.e., not concentrated in the first eigenvector. Rotate or not rotate? The varimax method is the most commonly used; it is found in most statistical packages.
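The routine on slide 29 calls the custom function varmaxt.m; as an alternative sketch, the Statistics Toolbox function rotatefactors performs a varimax rotation by default (the scaling by the square root of the eigenvalues mirrors the h2 = h*sqrt(s) step of slide 29):

X = randn(100, 6) * randn(6);               % stand-in data set
z = (X - mean(X)) ./ std(X);                % standardized anomalies
[pc, lam] = pcacov(cov(z));                 % unrotated eigenvectors and eigenvalues
loads = pc(:,1:3) * diag(sqrt(lam(1:3)));   % scale the first 3 modes by sqrt(eigenvalue)
[loadsrot, T] = rotatefactors(loads);       % varimax rotation; T is the rotation matrix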


24 Important references:
Bretherton et al., 1992, J. Climate
Chen and Haar, 1993, Mon. Wea. Rev.
Fraedrich et al., 1993, J. Atmos. Sci.
Fraedrich et al., 1997, J. Atmos. Sci.
Gurgel and Ferreira, 2003 (PCA)
Korres et al., 2000, J. Climate (EOF)
Uvo, 2003
Luo and Yamagata, 2002
Busuioc, A., D. L. Chen, and C. Hellström: Temporal and spatial variability of precipitation in Sweden and its link with the large-scale atmospheric circulation. Tellus A, 53(3), May 2001.
Most references can be downloaded by ftp from air.tvrl.lth.se (username: course, password: coursetsa; cd papers; mget *) or taken from fire/public/cintia/multivariate analysis.

25 Report. You are expected to read through the references above before writing your report on EOF; they contain several examples of how to use PCA/EOF in the geosciences.
- Describe your data.
- Apply EOF to your data set. Interpret your results physically. How is the explained variance distributed among the modes? How many modes do you consider important, and why? Can you explain the important modes physically? What do they mean?
- Rotate your EOF (varimax). Does it improve the physical analysis of your higher modes? Can you explain more modes physically than before?
- Apply extended EOF to your data set (any way you like). Does it provide any further information about the time evolution?

26 A general routine to perform PCA/EOF

% General routine to perform PCA
clear
load <data file>                 % load your data file
z = <data file>;                 % data matrix, time x variable
[nt,nz] = size(z);
%
% check for zero mean per column on z
%
% Calculation of the anomalies of a matrix time vs. variable (z)
anom = z - (1/size(z,1))*ones(size(z,1))*z;   % subtract the mean of each column
clear z
% standardization
d = diag(std(anom));
z = anom*d^(-1);                 % z is the standardized data matrix
clear d anom
%
% calculation of PCA
covz = cov(z);
[pc,variances,explained] = pcacov(covz);   % eigenvectors, eigenvalues, % variance explained
h = pc;
s = variances;
expvar = explained;
clear pc variances explained

27
%
% Saving the explained variance
fprintf('explained variance %10.2f \n', expvar(1:5))
fid = fopen('expvar.dat','w');
fprintf(fid,'%5.2f \n',expvar);
fclose(fid);
%
% Working on eigenvectors
% Calculating and plotting the time series
bk = z * h;                      % time series (dimension time x mode)
for i = 1:7                      % this 7 is the number of modes (you choose)
    plot(bk(:,i)')
    title(['time series for mode ' num2str(i) ', var explained ' num2str(expvar(i))])
    pause
end
%
% here the eigenvectors are saved as they are, in asc format
fid1 = fopen('eigenvec.dat','w');
fprintf(fid1,'%10.5f %10.5f\n', h(:,1:7)')   % 7 is the number of modes
fclose(fid1);

28
%
% Calculating local variance (squared correlation between the time
% series and the original data series)
for i = 1:nz
    for j = 1:7
        cor = corrcoef([bk(:,j) z(:,i)]);
        cc(i,j) = cor(1,2)^2;    % squared correlation, dimension variable x mode
    end
end
clear h
h = cc;
%
fclose('all')

29 Program pcarot_general.m

% Routine to perform EOF using SVD
% rotation is made using the varmaxt.m function
clear
load <data file>                 % load your data file
z = <data file>;
[nt,nz] = size(z);
%
% check for zero mean per column on z
%
% Calculation of the anomalies of a matrix time vs. variable (z)
anom = z - (1/size(z,1))*ones(size(z,1))*z;
clear z
% standardization
d = diag(std(anom));
z = anom*d^(-1);                 % z is the standardized data matrix
clear d anom
%
% preparing for the calculation
% cross covariance matrix
ccov = z' * z / (nt-1);
[g,s,h] = svd(ccov,0);           % eigenvectors in the columns of h, eigenvalues on diag(s)
% diag(s(1:5,1:5))
h2 = h*sqrt(s);                  % scale the eigenvectors by the sqrt of the eigenvalues
%
% rotating
[hrot,at,cscor,vrot,a] = varmaxt(h2,3,'y');   % varimax rotation of the first 3 modes
%

30
%
% calculation of the new explained variance
expvar = vrot/trace(s);
fprintf('explained variance %10.2f \n', expvar(1:3)*100)
fid = fopen('expvar.dat','w');
fprintf(fid,'%5.2f \n',expvar);
fclose(fid);
%
% Calculating and plotting the rotated time series
bk = z * h;                      % time series (dimension time x mode)
% normalizing bk
% check for zero mean on z
[r,n] = size(bk);
clear m
m = mean(bk(:,1:3));
for ii = 1:r
    bk(ii,1:3) = bk(ii,1:3) - m;     % remove the mean of the first 3 modes
end
% standardize
sz = std(bk(:,1:3));
for ii = 1:r
    bk(ii,1:3) = bk(ii,1:3)./sz;     % divide by the standard deviation
end
clear ii sz m
% rotating
bkrot = bk(:,1:3) * at;              % apply the varimax rotation matrix
for i = 1:3
    plot(bkrot(:,i)')
    title(['pcp rotated time series for mode ' num2str(i) ', var explained ' num2str(expvar(i)*100)])

31
    pause
    print -dps -append tseries.ps    % append each mode's plot to a postscript file
end
%
% here I save the eigenvectors as they are, in asc format
% fid1 = fopen('/usr/cintia/scand/pcpeigenvec.dat','w');
% fprintf(fid1,'%10.5f %10.5f\n',h(:,1:2)')
% fclose(fid1);
% return
%
% Calculating local variance (squared correlation btw the time
% series and the original data series)
% reg_var
%
fclose('all')
