Principal Dynamical Components
1 Principal Dynamical Components Manuel D. de la Iglesia* Departamento de Análisis Matemático, Universidad de Sevilla Instituto Nacional de Matemática Pura e Aplicada (IMPA) Rio de Janeiro, May 14, 2013 *Joint work with Esteban G. Tabak, Courant Institute
2 OUTLINE 1. Principal component analysis (PCA) and autoregressive models (AR(p)) 2. Principal dynamical components (PDC) 3. Application to the Global Sea-Surface Temperature Field
6 PRINCIPAL COMPONENT ANALYSIS (PCA) Suppose that z ∈ R^n is a vector of random variables, and that the variances of the n random variables and the structure of the covariances between them are of interest. Given N independent observations z_1, z_2, ..., z_N ∈ R^n, the main goal of PCA is to reduce the dimensionality of the data set while retaining as much as possible of the variation present in it. This is achieved by transforming to a new set of m variables (m < n), the principal components (PCs), which are uncorrelated, and which are ordered so that the first few retain most of the variation present in all of the original variables.
10 A COUPLE OF EXAMPLES One analysis considered the grades of N = 15 students in n = 8 subjects. The first two PCs account for 82.1% of the total variation in the data set. The first was strongly correlated with humanities subjects and the second with science subjects. The analysis of n = 11 socio-economic indicators of N = 96 countries revealed that almost all the information could be explained by only two PCs. The first was related to the gross domestic product of the country, while the second reflected rural conditions.
13 HOW TO OBTAIN THE PCs Singular value decomposition. Given a data set z_1, z_2, ..., z_N ∈ R^n (with the mean value already subtracted), the first m PCs are given by x_j = Q_x' z_j, where Q_x ∈ R^{n×m} has orthonormal columns such that the predictive uncertainty

Σ_{j=1}^N ||z_j − Q_x x_j||²

is minimal. The matrix Q_x consists of the first m columns of U in the singular value decomposition Z' = U S V', where U ∈ R^{n×n} and V ∈ R^{N×N} are orthogonal matrices and S ∈ R^{n×N} is diagonal, with the singular values (the square roots of the eigenvalues of the covariance matrix Z'Z) sorted in decreasing order (Z ∈ R^{N×n} stacks the observations z_j' as its rows).
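The construction above can be sketched numerically (an illustrative snippet, not from the slides; the helper name pca_svd and the synthetic data are ours):

```python
import numpy as np

def pca_svd(Z, m):
    """First m principal components of a data matrix Z (N samples x n variables)."""
    Zc = Z - Z.mean(axis=0)                    # subtract the mean value
    # SVD of Z': the columns of U are the principal directions
    U, s, Vt = np.linalg.svd(Zc.T, full_matrices=False)
    Qx = U[:, :m]                              # n x m with orthonormal columns
    X = Zc @ Qx                                # PCs: x_j = Qx' z_j
    return Qx, X

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
Qx, X = pca_svd(Z, 2)
# Qx'Qx = I, the PCs are uncorrelated, and their variances are sorted
```

The uncorrelatedness and the variance ordering of the PCs follow directly from the orthogonality of U and V and the sorting of the singular values.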
15 AUTOREGRESSIVE MODELS (AR(p)) One-dimensional AR(p). For a random process z the AR(p) model is defined as

z_j = b + Σ_{i=1}^p a_i z_{j−i} + ε_j,

where a_1, ..., a_p are the parameters of the model, b is a constant, and ε_j is white noise. The process is stationary if the roots of the polynomial x^p − Σ_{i=1}^p a_i x^{p−i} lie within the unit circle. The parameters a_1, ..., a_p can be estimated by solving the Yule-Walker equations

γ_m = Σ_{i=1}^p a_i γ_{m−i}, m = 1, ..., p,

where γ_m = E[z_{j+m} z_j] are the covariance functions (γ_{−m} = γ_m). In matrix form, (γ_1, ..., γ_p)' = Γ (a_1, ..., a_p)', with Γ_{kl} = γ_{k−l}.
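The estimation step can be sketched in a few lines (illustrative code, not from the slides): the sample autocovariances give a linear system for a_1, ..., a_p.

```python
import numpy as np

def yule_walker(z, p):
    """Estimate AR(p) coefficients by solving the Yule-Walker system
    gamma_m = sum_i a_i gamma_{m-i}, m = 1..p."""
    z = np.asarray(z) - np.mean(z)
    N = len(z)
    # sample autocovariances gamma_0 .. gamma_p
    gamma = np.array([z[: N - m] @ z[m:] / N for m in range(p + 1)])
    # Toeplitz system: R a = r with R[i, j] = gamma[|i - j|], r[m] = gamma[m + 1]
    R = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, gamma[1 : p + 1])

# simulate a stationary AR(2): z_j = 0.5 z_{j-1} - 0.3 z_{j-2} + eps_j
rng = np.random.default_rng(1)
z = np.zeros(20000)
for j in range(2, len(z)):
    z[j] = 0.5 * z[j - 1] - 0.3 * z[j - 2] + rng.normal()
a_hat = yule_walker(z, 2)
```

With this sample size the estimated coefficients land close to (0.5, −0.3).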
17 SOME EXAMPLES OF AR(1) AND AR(2)
18 VECTOR AR(p) For a vector-valued random process z ∈ R^n the AR(p) model is defined as

z_j = b + Σ_{i=1}^p A_i z_{j−i} + ε_j,

where A_1, ..., A_p ∈ R^{n×n} are the parameters of the model, b is a constant vector, and ε_j is a vector of white noise. Now the process is stationary if the roots of

det( x^p I_n − Σ_{i=1}^p A_i x^{p−i} ) = 0

lie within the unit circle. Similarly, an estimate of the parameters A_1, ..., A_p can be obtained by solving the Yule-Walker equations (now with block matrices).
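The stationarity condition can be checked numerically through the block companion matrix, whose eigenvalues are exactly the roots of the determinantal equation above (a sketch with our own helper name, not from the slides):

```python
import numpy as np

def var_is_stationary(A_list):
    """Check stationarity of a VAR(p): build the block companion matrix and
    verify that all its eigenvalues lie strictly inside the unit circle."""
    p = len(A_list)
    n = A_list[0].shape[0]
    C = np.zeros((n * p, n * p))
    C[:n, :] = np.hstack(A_list)                 # top block row: A_1 ... A_p
    C[n:, : n * (p - 1)] = np.eye(n * (p - 1))   # shifted identity blocks below
    return bool(np.all(np.abs(np.linalg.eigvals(C)) < 1))

A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
stationary = var_is_stationary([A1, A2])
```

A unit-root example such as A_1 = I is correctly flagged as non-stationary.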
21 PRINCIPAL DYNAMICAL COMPONENTS (PDC) Goal: mix PCA and AR(p). Given a time series z_j ∈ R^n with transition probability T(z_{j+1} | z_j) (Markovian), PDC considers a dimensional reduction of T(z_{j+1} | z_j) of the form

T(z_{j+1} | z_j) = J(z_{j+1}) e(y_{j+1} | x_{j+1}) d(x_{j+1} | x_j),

where x = P_x(z) ∈ R^m and y = P_y(z) ∈ R^{n−m} (P_x and P_y are projection operators) and J(z) is the Jacobian determinant of the coordinate map z → (x, y). Here e(y | x) is a probabilistic embedding and d(x_{j+1} | x_j) is a reduced dynamical model.
23 PRINCIPAL DYNAMICAL COMPONENTS (PDC) We will focus on the case where P_x and P_y are orthogonal projections (therefore J(z) = 1) and the embedding and reduced dynamics are given by isotropic Gaussians:

e(y | x) = N(0, σ² I_{n−m}), d(x_{j+1} | x_j) = N(A x_j, σ² I_m).

Therefore, the log-likelihood function is given by

L = − Σ_{j=1}^{N−1} [ (n/2) log(2π) + n log(σ) + ( ||x_{j+1} − A x_j||² + ||y_{j+1}||² ) / (2σ²) ].

Maximizing L over P and A is equivalent to minimizing the cost function

c = 1/(N−1) Σ_{j=1}^{N−1} ( ||x_{j+1} − A x_j||² + ||y_{j+1}||² ).
25 LINEAR AND AUTONOMOUS CASE (MARKOVIAN) For a time series z ∈ R^n we look for an m-dimensional submanifold x = Q_x' z such that

z = [Q_x Q_y] (x; y)

(stacking x above y). At the same time we look for dynamics in the submanifold of the form x_{j+1} = A x_j. The minimization problem that defines Q = [Q_x Q_y] and A is

min_{Q,A} c = Σ_{j=1}^{N−1} || z_{j+1} − Q (A Q_x' z_j; 0) ||² = Σ_{j=1}^{N−1} || (x_{j+1}; y_{j+1}) − (A x_j; 0) ||².
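For a fixed projection Q, the inner minimization over A is an ordinary least-squares problem, which is the building block of any descent or alternating scheme over (Q, A). A minimal sketch (our own helper; the slides descend jointly over Q and A):

```python
import numpy as np

def best_dynamics(X):
    """Least-squares A minimizing sum_j ||x_{j+1} - A x_j||^2, given the
    reduced trajectory X whose rows are x_1', x_2', ... for a fixed Q_x."""
    X0, X1 = X[:-1], X[1:]                     # pairs (x_j, x_{j+1})
    # rows satisfy x_{j+1}' = x_j' A', so lstsq(X0, X1) returns A'
    return np.linalg.lstsq(X0, X1, rcond=None)[0].T

# a noiseless linear trajectory is recovered exactly
A_true = np.array([[0.9, 0.2], [-0.1, 0.7]])
X = np.zeros((30, 2))
X[0] = [1.0, -0.5]
for j in range(29):
    X[j + 1] = A_true @ X[j]
A_hat = best_dynamics(X)
```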
28 PDC SIMPLEST CASE n = 2 Let us call z = (A, P)'. We look for a one-dimensional submanifold x of z,

x = A cos(θ) + P sin(θ), y = −A sin(θ) + P cos(θ).

Additionally we look for a dynamic in x, x_{j+1} = a x_j. The cost function is then

c(θ, a) = Σ_{j=1}^{N−1} || (A_{j+1}, P_{j+1})' − (Ã_{j+1}, P̃_{j+1})' ||² = Σ_{j=1}^{N−1} || (x_{j+1}, y_{j+1})' − (x̃_{j+1}, ỹ_{j+1})' ||² = Σ_{j=1}^{N−1} [ (y_{j+1})² + (x_{j+1} − a x_j)² ],

where (x̃_{j+1}, ỹ_{j+1}) = (a x_j, 0) is the prediction. By contrast, the corresponding cost function for regular principal components in this 2-dimensional scenario is

c_PCA(θ) = Σ_{j=1}^N y_j².
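Both cost functions can be written down directly (a small sketch with our own function names, using the slide's rotation convention):

```python
import numpy as np

def pdc_cost(theta, a, A, P):
    """c(theta, a): rotate the observed (A, P) into (x, y), then charge the
    unexplained y_{j+1}^2 plus the dynamical misfit (x_{j+1} - a x_j)^2."""
    x = A * np.cos(theta) + P * np.sin(theta)
    y = -A * np.sin(theta) + P * np.cos(theta)
    return np.sum(y[1:] ** 2 + (x[1:] - a * x[:-1]) ** 2)

def pca_cost(theta, A, P):
    """c_PCA(theta): the static counterpart, charging only y_j^2."""
    y = -A * np.sin(theta) + P * np.cos(theta)
    return np.sum(y ** 2)

# sanity check: noiseless data generated exactly by x_{j+1} = a0 x_j, y = 0,
# then rotated by th0, has (near-)zero PDC cost at the true parameters
a0, th0 = 0.6, np.pi / 3
x = 2.0 * a0 ** np.arange(40)
Aj, Pj = x * np.cos(th0), x * np.sin(th0)
```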
32 SYNTHETIC EXAMPLE We created data from the dynamical model

x_{j+1} = a x_j + 0.3 ε_j^x, y_{j+1} = 0.6 ε_j^y,

where a = 0.6, j = 1, ..., 999, and ε_j^x, ε_j^y are independent samples from a normal distribution. Then we rotated the data through the angle θ = π/3,

A_j = x_j cos(θ) − y_j sin(θ), P_j = x_j sin(θ) + y_j cos(θ),

and we performed descent over the variables a and θ.
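The whole experiment fits in a few lines (an illustrative re-creation with our own seed, a profile search over θ in place of the slides' descent, and more samples than the slide's 999 for a stable estimate):

```python
import numpy as np

rng = np.random.default_rng(2)
a_true, th_true, N = 0.6, np.pi / 3, 5000
x = np.zeros(N); y = np.zeros(N)
for j in range(N - 1):                  # the slide's model, eps ~ N(0, 1)
    x[j + 1] = a_true * x[j] + 0.3 * rng.normal()
    y[j + 1] = 0.6 * rng.normal()
Aj = x * np.cos(th_true) - y * np.sin(th_true)   # rotated observations
Pj = x * np.sin(th_true) + y * np.cos(th_true)

def cost(theta):
    """Profile cost: for each theta the optimal a is a 1-d least squares."""
    xr = Aj * np.cos(theta) + Pj * np.sin(theta)
    yr = -Aj * np.sin(theta) + Pj * np.cos(theta)
    a = (xr[1:] @ xr[:-1]) / (xr[:-1] @ xr[:-1])
    return np.sum(yr[1:] ** 2 + (xr[1:] - a * xr[:-1]) ** 2), a

thetas = np.linspace(0, np.pi, 2001)
costs = [cost(t)[0] for t in thetas]
th_hat = thetas[int(np.argmin(costs))]
a_hat = cost(th_hat)[1]
```

The minimizer recovers the planted angle and dynamics, θ̂ ≈ π/3 and â ≈ 0.6 (up to the θ ↔ θ + π symmetry of the rotation).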
34 SYNTHETIC EXAMPLE [Figure: data in the (A, P) plane, and the convergence of a, θ, and the cost c as functions of the descent step]
35 NONAUTONOMOUS PROBLEMS Take n = 2, m = 1. We created data from the dynamical model

x_{j+1} = a_j x_j + b_j + 0.3 ε_j^x, y_{j+1} = ȳ_j + ε_j^y,

for j = 1, ..., 999, and we adopted the values a_j = (6/5) cos²(2π t_j / T) for the dynamics, b_j = (1/2) sin(2π t_j / T) for the drift, and ȳ_j = (2/5) cos(2π t_j / T) for the non-zero mean of y, where t_j = j and T = 12. Then we rotated the data through

A_j = x_j cos(θ_j) − y_j sin(θ_j), P_j = x_j sin(θ_j) + y_j cos(θ_j),

where θ_j = (π/6) sin(2π t_j / T).
38 NONAUTONOMOUS PROBLEMS [Figure: data in the (A, P) plane; the recovered a(t), b(t), θ(t), and ȳ(t) against t_j; and the cost c against the descent step]
39 HIGHER-ORDER PROCESSES Take n = 2, m = 1, and r = 3, the order of the non-Markovian process. We created data from the dynamical model

x_{j+1} = a_1 x_j + a_2 x_{j−1} + a_3 x_{j−2} + 0.3 ε_j^x, y_{j+1} = 0.6 ε_j^y,

for j = 3, ..., 999, and we adopted the values a_1 = 0.4979, a_2 = …, a_3 = … for the dynamics. Then, as before, we define

A_j = x_j cos(θ) − y_j sin(θ), P_j = x_j sin(θ) + y_j cos(θ),

with θ = π/3, and provide the A_j and P_j as data for the principal dynamical component routine.
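Fitting order-3 dynamics from such data is again a least-squares regression on lagged values (a sketch: a_1 = 0.4979 is from the slide, while the a_2 and a_3 values below are hypothetical stand-ins, since the slide's values did not survive transcription):

```python
import numpy as np

# a_1 from the slide; a_2 and a_3 are hypothetical illustrative values
a = np.array([0.4979, 0.25, -0.1])

rng = np.random.default_rng(3)
x = np.zeros(5000)
for j in range(3, len(x)):
    x[j] = a[0] * x[j - 1] + a[1] * x[j - 2] + a[2] * x[j - 3] + 0.3 * rng.normal()

# regress x_{j+1} on (x_j, x_{j-1}, x_{j-2}) to recover the coefficients
X = np.column_stack([x[2:-1], x[1:-2], x[:-3]])
a_hat = np.linalg.lstsq(X, x[3:], rcond=None)[0]
```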
42 HIGHER-ORDER PROCESSES [Figure: data in the (A, P) plane, and the convergence of a_1, a_2, a_3, θ, and the cost c with the descent step]
43 APPLICATION TO THE GLOBAL SEA-SURFACE TEMPERATURE FIELD Preliminary considerations. Database: monthly averaged, extended reconstructed global sea-surface temperatures based on COADS data (January 1854 to October 2009). The ocean is not an isolated player in climate dynamics: it interacts with the atmosphere and the continents, and is also affected by external conditions, such as solar radiation or human-related release of CO2 into the atmosphere. The latter are examples of slowly varying external trends that fit naturally into our non-autonomous setting. Even within the ocean, the surface temperature does not evolve alone: it is carried by currents, and it interacts through mixing with lower layers of the ocean. One way to account for unobserved variables is to make the model non-Markovian.
47 50 POINTS IN THE OCEAN [Figure: world map marking the 50 selected points in the ocean] Hence n = 50 and N = 1858.
48 CHOICE OF DIMENSION AND ORDER Order of the process: r = 3
51 CHOICE OF DIMENSION AND ORDER [Figure: predictive uncertainty (final error) as a function of the order r of the non-Markovian process and of the dimension m of the reduced dynamical manifold] Therefore, we pick r = 3 and m = 4.
53 REAL AND PREDICTED DYNAMICAL COMPONENTS [Figure: the first, second, third, and fourth dynamical components x plotted against their predictions x_pred over time t]
54 POINTS OF SIGNIFICANCE: 19, 24, 37, 41 [Figure: significance of the first, second, third, and fourth components at each of the 50 points in the ocean]
55 50 POINTS IN THE OCEAN [Figure: world map of the 50 ocean locations]
57 OBSERVED AND PREDICTED TEMPERATURES [Figure: observed and predicted temperatures over time t]
58 ANOMALIES We consider the three-month mean SST anomaly in the following El Niño region: [Figure: world map indicating the El Niño region]
59 ANOMALIES [Figure: the three-month mean SST anomalies over time t]
60 CONCLUSIONS This new methodology allows a dimensional reduction of time series, seeking a low-dimensional manifold x and a dynamical model x_{j+1} = D(x_j, x_{j−1}, ..., t) that minimize the predictive uncertainty of the series. We have tested it on synthetic data and on a real application to sea-surface temperature over the ocean. It would be interesting to apply this new methodology to other types of data, such as economic or social data. An interesting question is future prediction. For sea-surface temperatures, the prediction degrades after about 3 months, because many unresolved variables can modify the model significantly. Nevertheless, the prediction could work well for other systems where the data are less volatile. M. D. de la Iglesia and E. G. Tabak, Principal dynamical components, Comm. Pure Appl. Math. 66 (2013), no. 1, 48–82.
65 THANK YOU
More informationLinear Factor Models. Sargur N. Srihari
Linear Factor Models Sargur N. srihari@cedar.buffalo.edu 1 Topics in Linear Factor Models Linear factor model definition 1. Probabilistic PCA and Factor Analysis 2. Independent Component Analysis (ICA)
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationLegendre s Equation. PHYS Southern Illinois University. October 18, 2016
Legendre s Equation PHYS 500 - Southern Illinois University October 18, 2016 PHYS 500 - Southern Illinois University Legendre s Equation October 18, 2016 1 / 11 Legendre s Equation Recall We are trying
More information1 Kalman Filter Introduction
1 Kalman Filter Introduction You should first read Chapter 1 of Stochastic models, estimation, and control: Volume 1 by Peter S. Maybec (available here). 1.1 Explanation of Equations (1-3) and (1-4) Equation
More informationUnsupervised Learning
2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and
More informationUnsupervised Learning: Dimensionality Reduction
Unsupervised Learning: Dimensionality Reduction CMPSCI 689 Fall 2015 Sridhar Mahadevan Lecture 3 Outline In this lecture, we set about to solve the problem posed in the previous lecture Given a dataset,
More informationFunctional time series
Rob J Hyndman Functional time series with applications in demography 4. Connections, extensions and applications Outline 1 Yield curves 2 Electricity prices 3 Dynamic updating with partially observed functions
More informationECE534, Spring 2018: Solutions for Problem Set #5
ECE534, Spring 08: s for Problem Set #5 Mean Value and Autocorrelation Functions Consider a random process X(t) such that (i) X(t) ± (ii) The number of zero crossings, N(t), in the interval (0, t) is described
More informationWhat is Principal Component Analysis?
What is Principal Component Analysis? Principal component analysis (PCA) Reduce the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables Retains most
More informationVAR Model. (k-variate) VAR(p) model (in the Reduced Form): Y t-2. Y t-1 = A + B 1. Y t + B 2. Y t-p. + ε t. + + B p. where:
VAR Model (k-variate VAR(p model (in the Reduced Form: where: Y t = A + B 1 Y t-1 + B 2 Y t-2 + + B p Y t-p + ε t Y t = (y 1t, y 2t,, y kt : a (k x 1 vector of time series variables A: a (k x 1 vector
More informationLECTURE 16: PCA AND SVD
Instructor: Sael Lee CS549 Computational Biology LECTURE 16: PCA AND SVD Resource: PCA Slide by Iyad Batal Chapter 12 of PRML Shlens, J. (2003). A tutorial on principal component analysis. CONTENT Principal
More informationLecture 4: Least Squares (LS) Estimation
ME 233, UC Berkeley, Spring 2014 Xu Chen Lecture 4: Least Squares (LS) Estimation Background and general solution Solution in the Gaussian case Properties Example Big picture general least squares estimation:
More informationMLCC 2015 Dimensionality Reduction and PCA
MLCC 2015 Dimensionality Reduction and PCA Lorenzo Rosasco UNIGE-MIT-IIT June 25, 2015 Outline PCA & Reconstruction PCA and Maximum Variance PCA and Associated Eigenproblem Beyond the First Principal Component
More informationIntroduction PCA classic Generative models Beyond and summary. PCA, ICA and beyond
PCA, ICA and beyond Summer School on Manifold Learning in Image and Signal Analysis, August 17-21, 2009, Hven Technical University of Denmark (DTU) & University of Copenhagen (KU) August 18, 2009 Motivation
More informationNumerical Methods I Singular Value Decomposition
Numerical Methods I Singular Value Decomposition Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 9th, 2014 A. Donev (Courant Institute)
More informationLecture 5 Singular value decomposition
Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn
More informationVectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =
Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.
More informationWrite your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
2016 Booklet No. Test Code : PSA Forenoon Questions : 30 Time : 2 hours Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
More informationDimension Reduction and Low-dimensional Embedding
Dimension Reduction and Low-dimensional Embedding Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1/26 Dimension
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More informationPrincipal component analysis
Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 The Singular Value Decomposition (SVD) continued Linear Algebra Methods for Data Mining, Spring 2007, University
More informationIntroduction to Probability and Stocastic Processes - Part I
Introduction to Probability and Stocastic Processes - Part I Lecture 2 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark
More informationState-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53
State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State
More information1. Introduction Systems evolving on widely separated time-scales represent a challenge for numerical simulations. A generic example is
COMM. MATH. SCI. Vol. 1, No. 2, pp. 385 391 c 23 International Press FAST COMMUNICATIONS NUMERICAL TECHNIQUES FOR MULTI-SCALE DYNAMICAL SYSTEMS WITH STOCHASTIC EFFECTS ERIC VANDEN-EIJNDEN Abstract. Numerical
More information5: MULTIVARATE STATIONARY PROCESSES
5: MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalarvalued random variables on the same probability
More informationMATH 829: Introduction to Data Mining and Analysis Principal component analysis
1/11 MATH 829: Introduction to Data Mining and Analysis Principal component analysis Dominique Guillot Departments of Mathematical Sciences University of Delaware April 4, 2016 Motivation 2/11 High-dimensional
More informationIntelligent Data Analysis. Principal Component Analysis. School of Computer Science University of Birmingham
Intelligent Data Analysis Principal Component Analysis Peter Tiňo School of Computer Science University of Birmingham Discovering low-dimensional spatial layout in higher dimensional spaces - 1-D/3-D example
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationSensitivity Analysis for the Problem of Matrix Joint Diagonalization
Sensitivity Analysis for the Problem of Matrix Joint Diagonalization Bijan Afsari bijan@math.umd.edu AMSC Department Sensitivity Analysis for the Problem of Matrix Joint Diagonalization p. 1/1 Outline
More informationFundamentals of Unconstrained Optimization
dalmau@cimat.mx Centro de Investigación en Matemáticas CIMAT A.C. Mexico Enero 2016 Outline Introduction 1 Introduction 2 3 4 Optimization Problem min f (x) x Ω where f (x) is a real-valued function The
More information5 Operations on Multiple Random Variables
EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y
More informationLinear Algebra, part 2 Eigenvalues, eigenvectors and least squares solutions
Linear Algebra, part 2 Eigenvalues, eigenvectors and least squares solutions Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2013 Main problem of linear algebra 2: Given
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationPredictive spatio-temporal models for spatially sparse environmental data. Umeå University
Seminar p.1/28 Predictive spatio-temporal models for spatially sparse environmental data Xavier de Luna and Marc G. Genton xavier.deluna@stat.umu.se and genton@stat.ncsu.edu http://www.stat.umu.se/egna/xdl/index.html
More informationPreliminary algebra. Polynomial equations. and three real roots altogether. Continue an investigation of its properties as follows.
978-0-51-67973- - Student Solutions Manual for Mathematical Methods for Physics and Engineering: 1 Preliminary algebra Polynomial equations 1.1 It can be shown that the polynomial g(x) =4x 3 +3x 6x 1 has
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationPrincipal Component Analysis (PCA) Theory, Practice, and Examples
Principal Component Analysis (PCA) Theory, Practice, and Examples Data Reduction summarization of data with many (p) variables by a smaller set of (k) derived (synthetic, composite) variables. p k n A
More informationData Mining Lecture 4: Covariance, EVD, PCA & SVD
Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The
More informationvariability and their application for environment protection
Main factors of climate variability and their application for environment protection problems in Siberia Vladimir i Penenko & Elena Tsvetova Institute of Computational Mathematics and Mathematical Geophysics
More information