
Chap. 11 Nonlinear principal component analysis [Book, Chap. 10]

We have seen machine learning methods that nonlinearly generalize linear regression. Now we will examine ways to nonlinearly generalize principal component analysis (PCA). The figure below illustrates the difference between (a) linear regression, (b) PCA, (c) nonlinear regression, and (d) nonlinear PCA.

Figure: (a) The linear regression line minimizes the mean squared error (MSE) in the response variable. The dashed line illustrates the dramatically different result when the roles of the predictor and response variables are reversed. (b) PCA minimizes the MSE in all variables. (c) Nonlinear regression methods produce a curve minimizing the MSE in the response variable. (d) Nonlinear PCA methods use a curve which minimizes the MSE of all variables. In both (c) and (d), the smoothness of the curve can be varied by the method. [Reproduced from Hastie and Stuetzle (1989)].

Nonlinear PCA can be performed by a variety of methods, e.g. the auto-associative NN model using multi-layer perceptrons (MLP) (Kramer, 1991; Hsieh, 2001, 2007), and the kernel PCA model (Schölkopf et al., 1998). Nonlinear PCA belongs to the class of nonlinear dimensionality reduction techniques, which also includes principal curves (Hastie and Stuetzle, 1989), locally linear embedding (LLE) (Roweis and Saul, 2000) and isomap (Tenenbaum et al., 2000).

The self-organizing map (SOM) (Kohonen, 1982) can also be regarded as a discrete version of NLPCA.

11.1 Auto-associative neural networks for nonlinear PCA [Book, Sect. 10.1]

Open curves [Book, Sect. 10.1.1]

Kramer (1991) proposed a neural-network based nonlinear PCA (NLPCA) model where the straight line solution in PCA is replaced by a continuous open curve for approximating the data. The fundamental difference between NLPCA and PCA is that PCA only allows a linear mapping (u = e · x) between x and the PC u, while NLPCA allows a nonlinear mapping.
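As a point of comparison, the linear PCA mapping is just a projection of each (centred) data vector onto the leading eigenvector of the covariance matrix. A minimal numpy sketch (the data array is a random placeholder, purely for illustration):

```python
import numpy as np

x = np.random.randn(500, 3)             # placeholder data matrix (n_samples, n_vars), assumed centred
cov = np.cov(x, rowvar=False)           # covariance matrix of the variables
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
e = eigvecs[:, -1]                      # leading eigenvector (largest eigenvalue)
u = x @ e                               # linear PC for every observation: u = e . x
```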

Figure: A schematic diagram of the NN model for calculating the NLPCA. There are 3 layers of hidden neurons sandwiched between the input layer x on the left and the output layer x′ on the right. Next to the input layer is the encoding layer, followed by the bottleneck layer (with a single neuron u), which is then followed by the decoding layer. A nonlinear function maps from the higher-dimensional input space to the 1-dimensional bottleneck space, followed by an inverse transform mapping from the bottleneck space back to the original space represented by the outputs, which are to be as close to the inputs as possible by minimizing the objective function J = ⟨‖x − x′‖²⟩, where ⟨· · ·⟩ denotes calculating the average over all the data points. Data compression is achieved by the bottleneck, with the bottleneck neuron giving u, the nonlinear principal component (NLPC).

One can view the NLPCA network as composed of two standard 2-layer MLP NNs placed one after the other. The first 2-layer network maps from the inputs x through a hidden layer to the bottleneck layer, here with only one neuron u, i.e. a nonlinear mapping u = f(x). The next 2-layer MLP NN inversely maps from the nonlinear PC (NLPC) u back to the original higher-dimensional x-space, with the objective that the outputs x′ = g(u) be as close as possible to the inputs x, where g(u) nonlinearly generates a curve in the x-space, hence a 1-dimensional approximation of the original data.
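A minimal sketch of this auto-associative architecture in Python with Keras (this is not the authors' Matlab code; the hidden-layer size, optimizer, epoch count and the random placeholder data are assumptions made only for illustration):

```python
import numpy as np
from tensorflow import keras

def build_nlpca(n_inputs, n_hidden=4):
    """Auto-associative NN: input -> encoding layer -> 1-neuron bottleneck -> decoding layer -> output."""
    x_in = keras.Input(shape=(n_inputs,))
    enc = keras.layers.Dense(n_hidden, activation="tanh")(x_in)      # encoding layer
    u = keras.layers.Dense(1, activation="linear", name="u")(enc)    # bottleneck neuron: the NLPC
    dec = keras.layers.Dense(n_hidden, activation="tanh")(u)         # decoding layer
    x_out = keras.layers.Dense(n_inputs, activation="linear")(dec)   # output layer x'
    model = keras.Model(x_in, x_out)
    model.compile(optimizer="adam", loss="mse")                      # minimizes J = <||x - x'||^2>
    encoder = keras.Model(x_in, u)                                   # the mapping u = f(x)
    return model, encoder

x = np.random.randn(500, 3)                # placeholder standing in for the real inputs
model, encoder = build_nlpca(n_inputs=3)
model.fit(x, x, epochs=200, verbose=0)     # targets are the inputs themselves (auto-associative)
u = encoder.predict(x)                     # nonlinear principal component (NLPC) for each sample
x_approx = model.predict(x)                # points on the 1-D curve approximating the data
```

Note that the published NLPCA implementations also use weight penalty terms and multiple random restarts to guard against overfitting and poor local minima; those refinements are omitted from this sketch.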

Because the target data for the output neurons x′ are simply the input data x, such networks are called auto-associative NNs. Squeezing the input information through a bottleneck layer (here with only one neuron) accomplishes the dimensionality reduction.

NLPCA of sea surface temperature anomalies [Book, Sect. 10.1.2]

The tropical Pacific SST anomaly (SSTA) data (1950-1999) (i.e. the SST data with the climatological seasonal cycle removed) were pre-filtered by PCA, with only the 3 leading modes retained (Hsieh, 2001). PCA modes 1, 2 and 3 accounted for 51.4%, 10.1% and 7.2%, respectively, of the variance in the SSTA data.

Due to the large number of spatially gridded variables, NLPCA could not be applied directly to the SSTA time series, as this would lead to a huge NN with the number of model parameters vastly exceeding the number of observations. Instead, the first 3 PCs (PC1, PC2 and PC3) were used as the input x for the NLPCA network.
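One way to carry out this kind of PCA pre-filtering is sketched below with scikit-learn; the gridded SSTA array, its shape and the variable names are placeholders, not the original dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

ssta = np.random.randn(600, 1000)       # placeholder for the SSTA field, shape (n_time, n_gridpoints)

pca = PCA(n_components=3)               # retain only the 3 leading modes
pcs = pca.fit_transform(ssta)           # (n_time, 3): PC1, PC2, PC3 -> the inputs x to NLPCA
eofs = pca.components_                  # (3, n_gridpoints): the PCA spatial eigenvectors (EOFs)
print(pca.explained_variance_ratio_)    # fraction of variance explained by each retained mode
```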

Figure: Scatter plot of the SST anomaly (SSTA) data (shown as dots) in the PC1-PC2 plane, with the El Niño states lying in the upper right corner, and the La Niña states in the upper left corner. The PC2 axis is stretched relative to the PC1 axis for better visualization. The first mode NLPCA approximation to the data is shown by the (overlapping) small circles, which traced out a U-shaped curve. The first PCA eigenvector is oriented along the horizontal line, and the second PCA eigenvector along the vertical line. The varimax method rotates the two PCA eigenvectors in a counterclockwise direction, so that the rotated PCA (RPCA) eigenvectors are oriented along the dashed lines.

In terms of variance explained, the first NLPCA mode explained 56.6% of the variance, versus 51.4% by the first PCA mode, and 47.2% by the first RPCA mode.

With the NLPCA, for a given value of the NLPC u, one can map from u to the 3 PCs. This is done by assigning the value u to the bottleneck neuron and mapping forward using the second half of the network. Each of the 3 PCs can be multiplied by its associated PCA (spatial) eigenvector, and the three added together to yield the spatial pattern for that particular value of u. Unlike PCA, which gives the same spatial anomaly pattern except for changes in amplitude as the PC varies, the NLPCA spatial pattern generally varies continuously as the NLPC changes.
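Continuing the sketches above (and reusing the trained model and the eofs array from them, so the names here are illustrative assumptions rather than part of the original method's code), the mapping from a chosen u back to a spatial pattern could look like this:

```python
import numpy as np
from tensorflow import keras

# Rebuild the second half of the auto-associative network as a stand-alone decoder,
# reusing the already-trained decoding and output layers of `model`.
u_in = keras.Input(shape=(1,))
h = model.layers[-2](u_in)              # trained decoding layer
pcs_out = model.layers[-1](h)           # trained output layer giving (PC1, PC2, PC3)
decoder = keras.Model(u_in, pcs_out)

u_value = np.array([[1.5]])             # some chosen value of the NLPC u
pcs_u = decoder.predict(u_value)        # shape (1, 3): the 3 PCs corresponding to this u
pattern = pcs_u @ eofs                  # (1, n_gridpoints): sum of PC_i times its spatial eigenvector
```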

[Figure: six-panel contour maps; panels (a) PCA mode 1, (b) PCA mode 2, (c) RPCA mode 1, (d) RPCA mode 2, (e) max(u) NLPCA, (f) min(u) NLPCA. See caption below.]

Figure: The SSTA patterns (in °C) of the PCA, RPCA and the NLPCA. The first and second PCA spatial modes are shown in (a) and (b) respectively (both with their corresponding PCs at maximum value). The first and second varimax RPCA spatial modes are shown in (c) and (d) respectively (both with their corresponding RPCs at maximum value). The anomaly patterns of the first NLPCA mode are shown as its NLPC u varies from (e) maximum (strong El Niño) to (f) minimum (strong La Niña). With a contour interval of 0.5 °C, the positive contours are shown as solid curves, negative contours as dashed curves, and the zero contour as a thick curve.

Clearly the asymmetry between El Niño and La Niña, namely that the cool anomalies during La Niña episodes (Fig. (f)) are centred much farther west than the warm anomalies during El Niño (Fig. (e)), is well captured by the first NLPCA mode.

With a linear approach, it is generally impossible to have a solution simultaneously (a) explaining maximum global variance of the dataset and (b) approaching local data clusters, hence the dichotomy between PCA and RPCA, with PCA aiming for (a) and RPCA for (b). With the more flexible NLPCA method, both objectives (a) and (b) may be attained together; thus the nonlinearity in NLPCA unifies the PCA and RPCA approaches.

The tropical Pacific SST example illustrates that with a complicated oscillation like the El Niño-La Niña phenomenon, using a linear method such as PCA results in the nonlinear mode being scattered into several linear modes (in fact, all 3 leading PCA modes are related to this phenomenon).

Closed curves [Book, Sect. 10.1.4]

Many phenomena involving waves or quasi-periodic fluctuations call for a continuous closed curve solution for NLPCA. Kirby and Miranda (1996) introduced an NLPCA with a circular node at the network bottleneck [henceforth referred to as the NLPCA(cir)], so that the nonlinear principal component (NLPC) as represented by the circular node is an angular variable θ, and the NLPCA(cir) is capable of approximating the data by a closed continuous curve.

Figure: Schematic diagram of the NN model used for NLPCA with a circular node at the bottleneck (NLPCA(cir)). Instead of having one bottleneck neuron u, there are two neurons p and q constrained to lie on a unit circle in the p-q plane, so there is only one free angular variable θ, the NLPC. This network is for extracting a closed curve solution.
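A sketch of one way the circular node constraint might be imposed in the same Keras setting as before (this is an assumption about the implementation, not Kirby and Miranda's original code): the two bottleneck pre-activations are normalized onto the unit circle, so that only the angle θ remains free.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def build_nlpca_cir(n_inputs, n_hidden=4):
    """Auto-associative NN with a circular bottleneck: (p, q) constrained to the unit circle."""
    x_in = keras.Input(shape=(n_inputs,))
    enc = keras.layers.Dense(n_hidden, activation="tanh")(x_in)
    pq_raw = keras.layers.Dense(2, activation="linear")(enc)          # unconstrained (p, q)
    # project onto the unit circle: p^2 + q^2 = 1, so theta is the only free variable
    pq = keras.layers.Lambda(
        lambda z: z / (tf.norm(z, axis=-1, keepdims=True) + 1e-8))(pq_raw)
    dec = keras.layers.Dense(n_hidden, activation="tanh")(pq)
    x_out = keras.layers.Dense(n_inputs, activation="linear")(dec)
    model = keras.Model(x_in, x_out)
    model.compile(optimizer="adam", loss="mse")
    encoder = keras.Model(x_in, pq)                                   # maps x -> (p, q) on the circle
    return model, encoder

# After training with model.fit(x, x, ...), the angular NLPC is recovered from (p, q):
#   p, q = encoder.predict(x).T
#   theta = np.arctan2(q, p)
```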

For an application of NLPCA(cir), consider the Quasi-Biennial Oscillation (QBO), which dominates over the annual cycle or other variations in the equatorial stratosphere, with the period of oscillation varying roughly between 22 and 32 months. Average zonal (i.e. the westerly component of the) winds at 70, 50, 40, 30, 20, 15 and 10 hPa (i.e. from about 20 km to 30 km altitude) during 1956-2006 were studied. After the 51-year means were removed, the zonal wind anomalies U at the 7 vertical levels in the stratosphere became the 7 inputs to the NLPCA(cir) network.
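A small sketch of that preprocessing step, assuming the winds are stored as a (time, level) array (the array itself is a placeholder, not the actual record):

```python
import numpy as np

# U: monthly zonal wind, shape (n_months, 7); columns = 70, 50, 40, 30, 20, 15, 10 hPa
U = np.random.randn(612, 7)          # placeholder standing in for the 1956-2006 record
U_anom = U - U.mean(axis=0)          # remove the 51-year mean at each level
# U_anom supplies the 7 inputs to the NLPCA(cir) network
```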

Figure: The NLPCA(cir) mode 1 solution for the equatorial stratospheric zonal wind anomalies. For comparison, the PCA mode 1 solution is shown by the dashed line. Only 3 out of 7 dimensions are shown, namely the zonal velocity anomaly U at the top, middle and bottom levels (10, 30 and 70 hPa). Panel (a) gives a 3-D view, while (b)-(d) give 2-D views.

Figure: Contour plot of the NLPCA(cir) mode 1 zonal wind anomalies as a function of pressure and phase θ_weighted, where θ_weighted is θ weighted by the histogram distribution of θ (see Hamilton and Hsieh, 2002). Thus θ_weighted is more representative of actual time during a cycle than θ. Contour interval is 5 m s⁻¹, with westerly winds indicated by solid lines, easterlies by dashed lines, and zero contours by thick lines.

As the easterly wind anomaly descends with time (i.e. as the phase increases), wavy behaviour is seen at the 40, 50 and 70 hPa levels at θ_weighted around 0.4-0.5.

Nonlinear singular spectrum analysis (NLSSA) has also been developed based on the NLPCA(cir) model (Hsieh and Wu, 2002) (Book, Sect. 10.6). Nonlinear canonical correlation analysis (NLCCA) by NN has also been developed (Cannon and Hsieh, 2008) (Book, Chap. 11).
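Returning to the weighted phase used in the contour plot above: the precise weighting is given in Hamilton and Hsieh (2002), but one plausible reading of "θ weighted by the histogram distribution of θ" is to map θ through its empirical cumulative distribution, so that equal increments of θ_weighted contain equal numbers of data points (i.e. roughly equal amounts of time). A sketch of that assumption:

```python
import numpy as np

def weighted_phase(theta):
    """Assumed reconstruction (not the published formula): map each phase theta (radians,
    in [-pi, pi)) through the empirical CDF of theta, so that theta_weighted advances
    roughly uniformly with actual time during a cycle."""
    ranks = np.argsort(np.argsort(theta))        # rank of each theta value within the record
    cdf = (ranks + 0.5) / len(theta)             # empirical CDF value in (0, 1)
    return -np.pi + 2.0 * np.pi * cdf            # rescale back to the interval [-pi, pi)
```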

Functions in Matlab: NLPCA and NLCCA codes at: www.ocgy.ubc.ca/projects/clim.pred/download.html

References:

Cannon, A. J. and Hsieh, W. W. (2008). Robust nonlinear canonical correlation analysis: application to seasonal climate forecasting. Nonlinear Processes in Geophysics, 12:221-232.

Hamilton, K. and Hsieh, W. W. (2002). Representation of the QBO in the tropical stratospheric wind by nonlinear principal component analysis. Journal of Geophysical Research, 107(D15), 4232, DOI: 10.1029/2001JD001250.

Hastie, T. and Stuetzle, W. (1989). Principal curves. Journal of the American Statistical Association, 84:502-516.

Hsieh, W. W. (2001). Nonlinear principal component analysis by neural networks. Tellus, 53A:599-615.

Hsieh, W. W. (2007). Nonlinear principal component analysis of noisy data. Neural Networks, 20:434-443.

Hsieh, W. W. and Wu, A. (2002). Nonlinear multichannel singular spectrum analysis of the tropical Pacific climate variability using a neural network approach. Journal of Geophysical Research, 107(C7), DOI: 10.1029/2001JC000957.

Kirby, M. J. and Miranda, R. (1996). Circular nodes in neural networks. Neural Computation, 8:390-402.

Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59-69.

Kramer, M. A. (1991). Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal, 37:233-243.

Roweis, S. T. and Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323-2326.

Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299-1319.

Tenenbaum, J. B., de Silva, V., and Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290:2319-2323.