Principal Component Analysis (PCA) or Empirical Orthogonal Function (EOF) (Lorenz, 1956)


1 Principal Component Analysis (PCA) or Empirical Orthogonal Function (EOF) (Lorenz, 1956). Hotelling, H., 1935: The most predictable criterion. J. Educ. Psychol., 26, 139-142. (from Jackson, 1991 and Graham, 1996 (class notes)) Jackson, J. E., 1991: A user's guide to principal components. Wiley & Sons, Inc., 569 pp. Objective: Reduce a data set containing a large number of variables to a data set containing far fewer variables, but that still represents a large fraction of the variability contained in the original data set. For a field: What are the important patterns that appear in this field, and how much of the variance do they explain?

2 Given a set of data X, we want to find a linear transformation of X (say U) that has the maximum variance among all such linear transformations: Var(XU) = max. (Numerical example: two variables x1 and x2 with correlation r(x1, x2) = 0.89, collected as x = (x1, x2) with its 2 x 2 sample covariance matrix S.)

3 If X is the data set and U the transformation, the transformed data will be Z = XU, and (in matrix notation)

var(z) = Z'Z/(n-1) = (XU)'(XU)/(n-1) = U'X'XU/(n-1)

but X'X/(n-1) is the variance-covariance matrix S, so that var(z) = U'SU, and we want it to be maximal. To solve the maximization problem: 1- constrain U to have unit length, so that U'U = 1; 2- define a Lagrange multiplier λ to bring the side constraint into the problem (a Lagrange multiplier is used to find the extrema of f(x) subject to the constraint g(x) = c), so that the problem becomes

var(z) = f(U) = U'SU - λ(U'U - 1) = max

Setting df/dU = 0 gives *** (S - λI) U = 0 ***

4 Eigenstructure of a square matrix A. For a square matrix A, there is a vector E such that AE = λE, where λ is a scalar. E contains the so-called eigenvectors and λ the eigenvalues. How to calculate λ and E?

AE = λE  →  AE - λE = 0  →  AIE - λIE = 0  →  (A - λI)E = 0, with E ≠ 0.

- Solving for λ: if (A - λI) had an inverse (i.e., it were non-singular, with det(A - λI) ≠ 0), then (A - λI)^(-1) (A - λI) E = 0 would imply E = 0 (no good). Therefore (A - λI) must be singular, i.e., det(A - λI) = 0, and λ can be determined from this equation (λ has as many values as the dimension of A).
- Solving for E: substitute each value of λ into (A - λI)E = 0 and find the E corresponding to each of them.

We can thus determine the eigenstructure of a square matrix from the so-called characteristic equation (A - λI)E = 0.
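As a quick numerical illustration (a minimal sketch; the matrix A below is an arbitrary example, not taken from the lecture), MATLAB's built-in eig solves the characteristic equation directly:

A = [2 1; 1 2];        % an arbitrary symmetric 2x2 matrix
[E, L] = eig(A);       % columns of E = eigenvectors, diagonal of L = eigenvalues
diag(L)                % the eigenvalues: 1 and 3 for this A
A*E - E*L              % numerically zero, confirming AE = lambda*E column by column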

5 Comparing the characteristic equation with the equation from the maximization problem, we can see that they have the same form: (S - λI)U = 0 and (A - λI)E = 0. This means that the variance-maximizing transformation of S is given by its eigenvectors, and the eigenmode with the largest eigenvalue gives the transformation with the largest variance. Because U'U = 1, the variance represented by each eigenmode is equal to its eigenvalue, and Σ_{i=1}^{n} λ_i = total variance. Graphically, the eigenvectors give the axes of deformation and the eigenvalues their magnitudes.

6 Applying (S - λI)U = 0 to our data set: 1- Solving for λ: set det(S - λI) = 0 and solve for the two roots λ1 and λ2. 2- Solving for U: substitute each root back into (S - λI)e = 0 to obtain the eigenvectors e1 and e2, which form the columns of E. The eigenvectors are often normalized so that they can be compared, which is easy: en_i = e_i / sqrt(e_i' e_i). Each column of E is called a principal component or mode.
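In MATLAB, the two steps above collapse into a single call (a sketch; the entries of S are illustrative, built from the correlation 0.89 of the example on slide 2):

S = [1.00 0.89; 0.89 1.00];   % covariance matrix of the standardized pair (x1, x2)
[E, L] = eig(S);              % solves det(S - lambda*I) = 0 and (S - lambda*I)e = 0 at once
lambda = diag(L)              % eigenvalues 0.11 and 1.89; they sum to trace(S), the total variance
E' * E                        % the identity: eig already returns unit-length (normalized) eigenvectors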

7 Important property!! E is orthonormal: e1'e1 = 1; e2'e2 = 1; e1'e2 = 0; in general, e_i'e_j = 1 for i = j and 0 for i ≠ j. The new variables u_m, which account successively for the maximum amount of variance, are calculated by projecting the old variables (x) onto the new axes (eigenvectors): u = E'x. u1 = e1'x is the linear combination of the elements of x with the greatest variance; u2 = e2'x is the linear combination with the greatest variance that is uncorrelated with u1, and so on. PCA is worth applying to a data set when there are substantial correlations among the variables contained in the original data set (i.e., redundant information). The elements of the new vectors are called the principal components and are uncorrelated (orthogonal).
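These properties are easy to verify numerically (a minimal sketch with synthetic data; all variable names here are illustrative):

rng(1);                                          % reproducible synthetic data
X = randn(200,3) * [1 0.9 0; 0 0.4 0; 0 0 1];    % 200 times x 3 correlated variables
Xa = X - mean(X);                                % center the data (anomalies)
[E, L] = eig(cov(Xa));                           % E is orthonormal: E'*E is the identity
U = Xa * E;                                      % project the old variables onto the new axes
corrcoef(U)                                      % off-diagonal terms ~0: the new variables are uncorrelated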

8 In practice: given a data set X(nt, nx) with dimensions time x variable: 1- Standardize or normalize your data. 2- Subtract the mean of each column; for better physical interpretation, PCA is applied only to centered data (anomalies). 3- Calculate the variance-covariance matrix S of X. 4- Obtain the eigenvalues and eigenvectors of S using whatever means you like. 5- Plot your eigenvectors (map or graph); each mode explains λ_i / Σ_j λ_j x 100% of the total variance. 6- Analyze your results.

9 Given a set of data represented as a matrix X in which each column represents a variable:

X = [ x_{1,1} x_{1,2} x_{1,3} ... x_{1,m}
      x_{2,1} x_{2,2} x_{2,3} ... x_{2,m}
      x_{3,1} x_{3,2} x_{3,3} ... x_{3,m}
      ...
      x_{n,1} x_{n,2} x_{n,3} ... x_{n,m} ]

1- PCA can only be conducted on centered data (anomalies), i.e., the average value of each column must be zero. 2- PCA (EOF) can be applied to either the covariance or the correlation matrix, but the covariance matrix can be used only when all variables have the same units (variance carries units). 3- One can always use the covariance matrix if the data are normalized first. 4- The correlation matrix can always be used (correlation is unitless).
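Point 3 can be checked in a few lines (a sketch with synthetic stand-in data):

X = randn(100,4) * randn(4);           % synthetic data with correlated columns
Z = (X - mean(X)) ./ std(X);           % normalize: zero mean and unit variance per column
max(max(abs(cov(Z) - corrcoef(X))))    % ~0: covariance of normalized data = correlation matrix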

10 Analysis of the PC / EOF eigenvectors. New time series: projection of the original data onto the new axes, u = E'x. u1 = e1'x is the linear combination of the elements of x with the greatest variance; u2 = e2'x is the linear combination with the greatest variance that is uncorrelated with u1, and so on.

11 Plot the new time series and analyze them. E.g.: Fraedrich (1993), mean zonal wind over Singapore.

12 Maps: - eigenvectors; - local variance (the squared correlation between the time series and the original data). If you have station data (Uvo, 2003; n stations, t times): interpolate the data to grid points (griddata in Matlab) and plot the line contours.


15 In case your data are given as grid points on maps, you have to rearrange them so that you can calculate the EOF: each map (times t1, t2, ..., tn) becomes one row of the data matrix. Follow steps 1 to 6 on page 8, then return the eigenvectors to map format to plot them. Important: before applying PCA, - replace missing data (missing data should not be more than 10% of the total); - exclude land grid points (in the case of oceanic data) or ocean grid points (in the case of land data).
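A minimal sketch of that rearrangement (all names and the random stand-in field are illustrative; for symmetric input, eig returns the eigenvalues in ascending order, so the last column is the leading mode):

nlat = 10; nlon = 12; nt = 50;
grid3d = randn(nlat, nlon, nt);          % stand-in gridded field (lat x lon x time)
grid3d(1:3, 1:4, :) = NaN;               % pretend these grid points are land
Z = reshape(grid3d, nlat*nlon, nt)';     % data matrix: rows = times, columns = grid points
keep = ~any(isnan(Z), 1);                % exclude the always-missing (land) points
Zsea = Z(:, keep);                       % apply steps 1 to 6 of page 8 to this matrix
[E, L] = eig(cov(Zsea - mean(Zsea)));    % eigenvectors of the anomaly covariance matrix
map = nan(nlat*nlon, 1);                 % return mode 1 to map format for plotting
map(keep) = E(:, end);                   % leading eigenvector back onto the kept points
map = reshape(map, nlat, nlon);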

16 Gurgel and Ferreira (2003): mean vegetation; 0.12% summer/winter; 0.04% spring/autumn; 0.03% semi-annual cycle.

17 Gurgel and Ferreira (2003)

18 Greatbatch (2000): Arctic Oscillation

19 Extended EOF (EEOF). The ordinary EOF provides a snapshot of the spatially evolving behaviour of your data. Suppose you are studying a phenomenon that varies in both time and space; how do you capture its evolution? Starting from the original data matrix (one row per time 1, 2, 3, ..., t, with columns y1, y2, y3, ..., yn holding each map), build a new matrix whose rows contain several consecutive maps:

Time 1: map1 map2 map3 map4
Time 2: map2 map3 map4 map5
Time 3: map3 map4 map5 map6
Time 4: map4 map5 map6 map7
...
Time t-3: map(t-3) map(t-2) map(t-1) map(t)

Look at this simply as a new data matrix with dimension (t-3) x (n*4) and apply steps 1 to 6 (page 8). The ordering (lag spacing) can be done however seems useful, e.g.:

Time 1: map1 map3 map5 map7
Time 2: map2 map4 map6 map8
Time 3: map3 map5 map7 map9
Time 4: map4 map6 map8 map10
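A sketch of the lagged-matrix construction in MATLAB (names are illustrative; Z stands for the original t x n data matrix):

Z = randn(100, 20);                   % stand-in data: t = 100 times, n = 20 values per map
nlag = 4;                             % number of consecutive maps per row
[t, n] = size(Z);
Zext = zeros(t-nlag+1, n*nlag);       % the extended data matrix, (t-3) x (n*4)
for k = 1:nlag
    Zext(:, (k-1)*n+1 : k*n) = Z(k : t-nlag+k, :);   % block k holds the maps lagged by k-1
end
% Zext is now an ordinary data matrix: apply steps 1 to 6 of page 8 to it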

20 For seasonally varying systems, it is common to use rows of the form:

Time 1: Jan Feb Mar Apr (year 1)
Time 2: Jan Feb Mar Apr (year 2)
Time 3: Jan Feb Mar Apr (year 3)
...
Time t: Jan Feb Mar Apr (year t)

EEOFs are especially suitable for detecting patterns of systematic evolution, such as propagation.

21 Example: Fraedrich

22 Rotated EOF. Used when the physical interpretation of the principal components (eigenvectors) is important. Mathematically, this process relaxes the orthogonality constraint on the principal components (eigenvectors). Physically, it means that the second and further eigenvectors are chosen as physical representations of the signal present in the data set, and not merely as orthogonal to the previous eigenvectors. Used mainly when the variance is distributed among several eigenvectors, i.e., not concentrated in the first eigenvector. Rotate or not rotate? The varimax method is the most commonly used; it is found in most statistical packages.
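The routine on slide 29 calls the custom function varmaxt.m; as an alternative sketch, the Statistics Toolbox function rotatefactors performs a varimax rotation by default (the scaling by the square root of the eigenvalues mirrors the h2 = h*sqrt(s) step of slide 29):

X = randn(100, 6) * randn(6);               % stand-in data set
z = (X - mean(X)) ./ std(X);                % standardized anomalies
[pc, lam] = pcacov(cov(z));                 % unrotated eigenvectors and eigenvalues
loads = pc(:,1:3) * diag(sqrt(lam(1:3)));   % scale the first 3 modes by sqrt(eigenvalue)
[loadsrot, T] = rotatefactors(loads);       % varimax rotation; T is the rotation matrix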


24 Important references:
Bretherton et al., 1992, J. Climate
Chen and Haar, 1993, Mon. Wea. Rev.
Fraedrich et al., 1993, J. Atmos. Sci.
Fraedrich et al., 1997, J. Atmos. Sci.
Gurgel and Ferreira, 2003 (PCA)
Korres et al., 2000, J. Climate (EOF)
Uvo, 2003
Luo and Yamagata, 2002
Busuioc, A., D. L. Chen, and C. Hellström: Temporal and spatial variability of precipitation in Sweden and its link with the large-scale atmospheric circulation. Tellus A, 53(3), May 2001.
Most references can be downloaded by ftp from air.tvrl.lth.se (username: course, password: coursetsa; cd papers; mget *) or taken from fire/public/cintia/multivariate analysis.

25 Report. You are expected to read through the references above before writing your report on EOF; they contain several examples of how to use PCA/EOF in the geosciences.
- Describe your data.
- Apply EOF to your data set. Interpret your results physically. How is the explained variance distributed among the modes? How many modes do you consider important, and why? Can you explain the important modes physically? What do they mean?
- Rotate your EOF (varimax). Does it improve the physical analysis of your higher modes? Can you explain more modes physically than before?
- Apply extended EOF to your data set (any way you like). Does it provide any further information about the time evolution?

26 A general routine to perform PCA/EOF

% General routine to perform PCA
clear
load <data file>                 % load your data file
z = <data file>;                 % data matrix, time x variable
[nt,nz] = size(z);
%
% check for zero mean per column on z
%
% Calculation of the anomalies of a matrix time vs. variable (z)
anom = z - (1/size(z,1))*ones(size(z,1))*z;   % subtract the mean of each column
clear z
% standardization
d = diag(std(anom));
z = anom*d^(-1);                 % z is the standardized data matrix
clear d anom
%
% calculation of PCA
covz = cov(z);
[pc,variances,explained] = pcacov(covz);   % eigenvectors, eigenvalues, % variance explained
h = pc;
s = variances;
expvar = explained;
clear pc variances explained

27
%
% Saving the explained variance
fprintf('explained variance %10.2f \n', expvar(1:5))
fid = fopen('expvar.dat','w');
fprintf(fid,'%5.2f \n',expvar);
fclose(fid);
%
% Working on eigenvectors
% Calculating and plotting the time series
bk = z * h;                      % time series (dimension time x mode)
for i = 1:7                      % this 7 is the number of modes (you choose)
    plot(bk(:,i)')
    title(['time series for mode ' num2str(i) ', var explained ' num2str(expvar(i))])
    pause
end
%
% here the eigenvectors are saved as they are, in asc format
fid1 = fopen('eigenvec.dat','w');
fprintf(fid1,'%10.5f %10.5f\n', h(:,1:7)')   % 7 is the number of modes
fclose(fid1);

28
%
% Calculating local variance (squared correlation between the time
% series and the original data series)
for i = 1:nz
    for j = 1:7
        cor = corrcoef([bk(:,j) z(:,i)]);
        cc(i,j) = cor(1,2)^2;    % squared correlation, dimension variable x mode
    end
end
clear h
h = cc;
%
fclose('all')

29 Program pcarot_general.m

% Routine to perform EOF using SVD
% rotation is made using the varmaxt.m function
clear
load <data file>                 % load your data file
z = <data file>;
[nt,nz] = size(z);
%
% check for zero mean per column on z
%
% Calculation of the anomalies of a matrix time vs. variable (z)
anom = z - (1/size(z,1))*ones(size(z,1))*z;
clear z
% standardization
d = diag(std(anom));
z = anom*d^(-1);                 % z is the standardized data matrix
clear d anom
%
% preparing for the calculation
% cross covariance matrix
ccov = z' * z / (nt-1);
[g,s,h] = svd(ccov,0);           % eigenvectors in the columns of h, eigenvalues on diag(s)
% diag(s(1:5,1:5))
h2 = h*sqrt(s);                  % scale the eigenvectors by the sqrt of the eigenvalues
%
% rotating
[hrot,at,cscor,vrot,a] = varmaxt(h2,3,'y');   % varimax rotation of the first 3 modes
%

30
%
% calculation of the new explained variance
expvar = vrot/trace(s);
fprintf('explained variance %10.2f \n', expvar(1:3)*100)
fid = fopen('expvar.dat','w');
fprintf(fid,'%5.2f \n',expvar);
fclose(fid);
%
% Calculating and plotting the rotated time series
bk = z * h;                      % time series (dimension time x mode)
% normalizing bk
% check for zero mean on z
[r,n] = size(bk);
clear m
m = mean(bk(:,1:3));
for ii = 1:r
    bk(ii,1:3) = bk(ii,1:3) - m;     % remove the mean of the first 3 modes
end
% standardize
sz = std(bk(:,1:3));
for ii = 1:r
    bk(ii,1:3) = bk(ii,1:3)./sz;     % divide by the standard deviation
end
clear ii sz m
% rotating
bkrot = bk(:,1:3) * at;              % apply the varimax rotation matrix
for i = 1:3
    plot(bkrot(:,i)')
    title(['pcp rotated time series for mode ' num2str(i) ', var explained ' num2str(expvar(i)*100)])

31
    pause
    print -dps -append tseries.ps    % append each mode's plot to a postscript file
end
%
% here I save the eigenvectors as they are, in asc format
% fid1 = fopen('/usr/cintia/scand/pcpeigenvec.dat','w');
% fprintf(fid1,'%10.5f %10.5f\n',h(:,1:2)')
% fclose(fid1);
% return
%
% Calculating local variance (squared correlation btw the time
% series and the original data series)
% reg_var
%
fclose('all')
