Kernel Generalized Canonical Correlation Analysis

To cite this version: Arthur Tenenhaus. Kernel Generalized Canonical Correlation Analysis. JdS'10, May 2010, Marseille, France. CD-ROM Proceedings (6 p.), 2010. <hal-00553602>
HAL Id: hal-00553602, https://hal-supelec.archives-ouvertes.fr/hal-00553602 (submitted on 7 Jan 2011)

Kernel Generalized Canonical Correlation Analysis

Arthur Tenenhaus
SUPELEC, Sciences des Systèmes (E3S), Département Signaux et Systèmes Electroniques
3, rue Joliot Curie, Plateau de Moulon, 91192 Gif-sur-Yvette Cedex

Résumé (translated from the French): A classical problem in statistics is to study the links between several blocks of variables. The objective is to find the variables of one block that influence the variables of the other blocks. Regularized Generalized Canonical Correlation Analysis (RGCCA) is a very attractive framework for dealing with this type of problem. However, RGCCA captures only linear relations between blocks; to reach nonlinear links, we propose a kernel extension of RGCCA.
Mots-clés (translated): Kernel methods, Multiple table analysis, Regularized Generalized Canonical Correlation Analysis, PLS approach

Abstract: A classical problem in statistics is to study relationships between several blocks of variables. The goal is to find variables of one block directly related to variables of other blocks. Regularized Generalized Canonical Correlation Analysis (RGCCA) is a very attractive framework to study this kind of relationship between blocks. However, RGCCA captures only linear relations between blocks, and to assess nonlinear relations we propose a kernel extension of RGCCA.
Keywords: Kernel methods, Multiblock data analysis, Regularized Generalized Canonical Correlation Analysis, PLS approach

1 Introduction

A common problem in applied statistics is to relate several blocks of variables to each other in order to find the variables of one block directly related to the variables of other blocks. Typical examples are found in a large variety of fields such as bioinformatics, sensory analysis, marketing and food research. To study this kind of relationship between blocks, the starting point of this paper is the generalized canonical correlation analysis (GCCA) proposed in [4]. This version of GCCA captures linear relationships between blocks; to assess nonlinear relations, we propose in this paper a kernel extension of GCCA. The paper is organized as follows: the first part presents the initial formulation of GCCA and the second part is devoted to its nonlinear version.

2 Population generalized canonical correlation analysis

Let us consider J random p_j-dimensional centered column vectors x_j, j = 1, ..., J, and J non-random p_j-dimensional column vectors ξ_j. We also consider a network of connections between the random vectors by defining a design matrix C = (c_{jk}): c_{jk} = 1 if x_j and x_k are connected, and 0 otherwise. Now consider two linear combinations η_j = ξ_j^t x_j and η_k = ξ_k^t x_k. The correlation between η_j and η_k is:

ρ(ξ_j^t x_j, ξ_k^t x_k) = ξ_j^t Σ_{jk} ξ_k / [ (ξ_j^t Σ_{jj} ξ_j)^{1/2} (ξ_k^t Σ_{kk} ξ_k)^{1/2} ]    (1)

where Σ_{jk} = E(x_j x_k^t) and Σ_{jj} = E(x_j x_j^t). Population generalized canonical correlation analysis is defined as the following optimization problem [4]:

max_{ξ_1, ξ_2, ..., ξ_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ρ(ξ_j^t x_j, ξ_k^t x_k) )    (2)
s.t.  var(ξ_j^t x_j) = 1,  j = 1, ..., J

where g is the identity, the absolute value or the square function. Problem (2) is equivalent to the following optimization problem:

max_{ξ_1, ξ_2, ..., ξ_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ξ_j^t Σ_{jk} ξ_k )    (3)
s.t.  ξ_j^t Σ_{jj} ξ_j = 1,  j = 1, ..., J

By cancelling the derivatives of the Lagrangian function associated with optimization problem (3) with respect to ξ_j and λ_j, we obtain J stationary equations:

(1/ϕ_j) Σ_{jj}^{-1} ∑_{k=1, k≠j}^{J} c_{jk} g'( ξ_j^t Σ_{jk} ξ_k ) Σ_{jk} ξ_k = λ_j ξ_j,   j = 1, ..., J    (4)

subject to the constraints:

ξ_j^t Σ_{jj} ξ_j = 1,   j = 1, ..., J    (5)
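Although not spelled out in the text, a useful sanity check is the two-block special case: with J = 2, c_{12} = 1 and g the identity, problem (2) reduces to classical canonical correlation analysis,

max_{ξ_1, ξ_2}  ρ(ξ_1^t x_1, ξ_2^t x_2)    s.t.  var(ξ_1^t x_1) = var(ξ_2^t x_2) = 1,

whose solution (ξ_1, ξ_2) gives the first pair of canonical variates of (x_1, x_2).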

2.1 Regularized Generalized Canonical Correlation Analysis

In practice, we have to estimate the ξ_j's from a finite sample. Let us consider J blocks X_1, ..., X_J of centered variables measured on a set of n individuals (a row of X_j represents a realization of the row random vector x_j^t). C = (c_{jk}) is now a design matrix describing a network of relationships between blocks: c_{jk} = 1 for two connected blocks, and 0 otherwise. In the case of high multicollinearity, or when the number of observations is smaller than the number of variables, the sample covariance matrix S_{jj} = (1/n) X_j^t X_j is a poor estimate of the true covariance matrix Σ_{jj}. A suggestion for obtaining a better estimate of the true covariance matrix is to consider the class of linear combinations Ŝ_{jj} = τ_j I + (1 − τ_j) S_{jj} of the identity matrix I and the sample covariance matrix S_{jj} [3]. We then consider a sample version of the stationary equations (4) with the constraints (5), replacing Σ_{jj} by Ŝ_{jj} and Σ_{jk} by S_{jk} = (1/n) X_j^t X_k. This leads to J sample stationary equations, which are the stationary equations associated with the regularized generalized canonical correlation analysis (RGCCA) defined below:

max_{a_1, a_2, ..., a_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ĉov(X_j a_j, X_k a_k) )    (6)
s.t.  (1 − τ_j) var(X_j a_j) + τ_j ||a_j||² = 1,  j = 1, ..., J

The regularization parameters τ_j ∈ [0, 1], j = 1, ..., J, interpolate smoothly between the maximization of the covariance (all τ_j's equal to 1) and the maximization of the correlation (all τ_j's equal to 0). More details on RGCCA can be found in [4]. This formulation of RGCCA detects only linear relations between blocks; in the next section, we introduce a kernel extension of RGCCA allowing the extraction of nonlinear relations between blocks.
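To make criterion (6) and its shrinkage constraint concrete, here is a minimal NumPy sketch (not part of the original paper; the toy blocks, the fully connected design matrix and the choice g(x) = x² are illustrative assumptions):

import numpy as np

def rgcca_criterion(X_blocks, a, C, g=np.square):
    # value of criterion (6) for given weight vectors a_j
    n = X_blocks[0].shape[0]
    J = len(X_blocks)
    total = 0.0
    for j in range(J):
        for k in range(j + 1, J):
            # empirical covariance of the block components X_j a_j and X_k a_k
            cov_jk = (X_blocks[j] @ a[j]) @ (X_blocks[k] @ a[k]) / n
            total += C[j, k] * g(cov_jk)
    return total

def constraint_values(X_blocks, a, tau):
    # (1 - tau_j) var(X_j a_j) + tau_j ||a_j||^2, which must equal 1 for each block
    n = X_blocks[0].shape[0]
    return [(1 - t) * np.sum((X @ w) ** 2) / n + t * np.sum(w ** 2)
            for X, w, t in zip(X_blocks, a, tau)]

# toy example: three centered blocks, fully connected design, tau_j = 0.5
rng = np.random.default_rng(0)
n = 50
X_blocks = [rng.standard_normal((n, p)) for p in (4, 6, 3)]
X_blocks = [X - X.mean(axis=0) for X in X_blocks]
C = np.ones((3, 3)) - np.eye(3)
tau = [0.5, 0.5, 0.5]
a = [rng.standard_normal(X.shape[1]) for X in X_blocks]
# the constraint is quadratic in a_j, so rescaling enforces it exactly
a = [w / np.sqrt(c) for w, c in zip(a, constraint_values(X_blocks, a, tau))]
print(rgcca_criterion(X_blocks, a, C), constraint_values(X_blocks, a, tau))

Setting all τ_j to 0 in this sketch turns the constraint into a unit-variance condition on X_j a_j, and setting them to 1 into a unit-norm condition on a_j, which is exactly the interpolation between correlation and covariance maximization described above.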

3 Kernel Generalized Canonical Correlation Analysis

For each j = 1, ..., J, let H_j be a Reproducing Kernel Hilbert Space (RKHS) with associated kernel k_j(·, ·) and feature map Φ_j(x) = k_j(·, x). We define Kernel Generalized Canonical Correlation Analysis (KGCCA) as the following optimization problem:

argmax_{f_1, ..., f_J ∈ H_1 × ... × H_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ρ(f_j(x_j), f_k(x_k)) )    (7)
s.t.  var(f_j(x_j)) = 1,  j = 1, ..., J

where

ρ(f_j(x_j), f_k(x_k)) = cov(f_j(x_j), f_k(x_k)) / [ var(f_j(x_j))^{1/2} var(f_k(x_k))^{1/2} ]    (8)

is the correlation between the random variables f_j(x_j) and f_k(x_k). The functions f_j and f_k are defined up to scale. From the reproducing property of the RKHS, ρ(f_j(x_j), f_k(x_k)) = cor(⟨Φ_j(x_j), f_j⟩, ⟨Φ_k(x_k), f_k⟩). Therefore, ρ(f_j(x_j), f_k(x_k)) is the correlation between the one-dimensional linear projections of Φ_j(x_j) onto f_j and of Φ_k(x_k) onto f_k. The canonical correlation between Φ_j(x_j) and Φ_k(x_k) is exactly the maximal possible correlation between these two projections.

Now, the main objective is to estimate f_1, ..., f_J from a finite sample. For each j = 1, ..., J, let {x_{j1}, ..., x_{jn}} be a set of n observations of x_j, let K̃_j be the associated Gram matrix defined by (K̃_j)_{rs} = k_j(x_{jr}, x_{js}), and let K_j be the Gram matrix of the centered data points in the feature space, defined as K_j = P K̃_j P where P = I − (1/n) 1, with 1 the n × n matrix composed of ones. For fixed f_j and f_k, it can be shown [1] that the empirical covariance of the projections in the feature space can be written as:

ĉov( ⟨Φ_j(x_j), f_j⟩, ⟨Φ_k(x_k), f_k⟩ ) = (1/n) α_j^t K_j K_k α_k    (9)

where K_j and K_k are the Gram matrices associated with the data sets {x_{ji}}_{i=1}^n and {x_{ki}}_{i=1}^n respectively. Combining optimization problem (7) with equation (9), and adding a constraint on the smoothness of the f_j's through ||f_j||²_{H_j} = α_j^t K_j α_j, the empirical counterpart of KGCCA becomes the following maximization problem:

max_{α_1, ..., α_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ĉov(K_j α_j, K_k α_k) )    (10)
s.t.  α_j^t ( (1 − τ_j) (1/n) K_j² + τ_j K_j ) α_j = 1,  j = 1, ..., J

The regularization parameters not only make this optimization problem well posed numerically and statistically ([1]; [2]) but also provide control over the capacity of the function space in which the solution is sought. The larger the values of τ_j, the less sensitive the method is to the input data and the more stable (less prone to finding spurious relations) the solution becomes.
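To make the empirical quantities entering (9) and (10) concrete, here is a minimal NumPy sketch (not from the original paper; the Gaussian kernel, its bandwidth and all variable names are illustrative assumptions):

import numpy as np

def gaussian_gram(X, sigma=1.0):
    # (K~_j)_rs = k_j(x_jr, x_js) with a Gaussian kernel, used purely as an example
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def center_gram(K_tilde):
    # K_j = P K~_j P with P = I - (1/n) * (n x n matrix of ones), as in the text
    n = K_tilde.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    return P @ K_tilde @ P

rng = np.random.default_rng(1)
n = 40
Xj, Xk = rng.standard_normal((n, 5)), rng.standard_normal((n, 3))
Kj, Kk = center_gram(gaussian_gram(Xj)), center_gram(gaussian_gram(Xk))

alpha_j, alpha_k = rng.standard_normal(n), rng.standard_normal(n)
tau_j = 0.5

# empirical covariance of the two projections, equation (9)
cov_jk = alpha_j @ Kj @ Kk @ alpha_k / n
# left-hand side of the constraint of problem (10) for block j
constraint_j = alpha_j @ ((1 - tau_j) * Kj @ Kj / n + tau_j * Kj) @ alpha_j
print(cov_jk, constraint_j)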

Since the Gram matrices K_j, j = 1, ..., J, are symmetric positive semidefinite, an (incomplete) Cholesky decomposition is applied to solve optimization problem (10) efficiently. Thus K_j = R_j^t R_j, where R_j is an n_j × n upper triangular matrix with n_j = rank(K_j). By setting w_j = R_j α_j, optimization problem (10) becomes:

max_{w_1, ..., w_J}  ∑_{1 ≤ j < k ≤ J} c_{jk} g( ĉov(R_j^t w_j, R_k^t w_k) )    (11)
s.t.  w_j^t ( (1 − τ_j) (1/n) R_j R_j^t + τ_j I_{n_j} ) w_j = 1,  j = 1, ..., J

A new procedure is proposed to solve (11). The first step of the procedure is to cancel the derivatives of the Lagrangian function related to the maximization problem (11) with respect to w_j and λ_j. This yields the following J stationary equations:

(1/ϕ_j) N_j^{-1} ∑_{k=1, k≠j}^{J} c_{jk} g'( (1/n) w_j^t R_j R_k^t w_k ) R_j R_k^t w_k = λ_j w_j,   j = 1, ..., J    (12)

subject to the constraints:

w_j^t N_j w_j = 1,   j = 1, ..., J    (13)

where N_j = (1 − τ_j) (1/n) R_j R_j^t + τ_j I_{n_j}. Borrowing from the PLS terminology, we introduce for each block an outer component y_j = R_j^t w_j and an inner component z_j defined as follows:

z_j = (1/ϕ_j) ∑_{k=1, k≠j}^{J} c_{jk} g'( ĉov(y_j, y_k) ) y_k    (14)

Then, combining equations (12), (14) and (13), we obtain the outer weights:

w_j = [ z_j^t R_j^t N_j^{-1} R_j z_j ]^{-1/2} N_j^{-1} R_j z_j,   j = 1, ..., J    (15)

We then propose an iterative PLS-style algorithm for finding a solution of the J stationary equations (see Table 1). This algorithm is monotonically convergent, which means that the bounded criterion to be maximized increases at each step of the procedure (the proof of this result is beyond the scope of the paper).

Table 1: Kernel Generalized Canonical Correlation Analysis Algorithm

A. Initialisation
A.1. Choose J arbitrary vectors w̃_1^(0), ..., w̃_J^(0).
A.2. Compute vectors w_1^(0), ..., w_J^(0) verifying the constraints as:
     w_j^(0) = [ (w̃_j^(0))^t N_j^{-1} w̃_j^(0) ]^{-1/2} N_j^{-1} w̃_j^(0)
A.3. Compute the outer components: y_1^(0) = R_1^t w_1^(0), ..., y_J^(0) = R_J^t w_J^(0).

B. Inner component for the block X_j
Compute the inner component according to the choice of g:
     z_j^(s) = (1/ϕ_j) [ ∑_{k=1}^{j−1} c_{jk} g'( ĉov(y_j^(s), y_k^(s+1)) ) y_k^(s+1) + ∑_{k=j+1}^{J} c_{jk} g'( ĉov(y_j^(s), y_k^(s)) ) y_k^(s) ]
where g'(x)/ϕ_j is equal to 1 for g equal to the identity, x for g equal to the square, and sign(x) for g equal to the absolute value.

C. Outer component for the block X_j
C.1. Compute the outer weight: w_j^(s+1) = [ (z_j^(s))^t R_j^t N_j^{-1} R_j z_j^(s) ]^{-1/2} N_j^{-1} R_j z_j^(s)
C.2. Compute the outer component: y_j^(s+1) = R_j^t w_j^(s+1)

The procedure begins with an arbitrary initialisation (A). Suppose the outer components y_1^(s+1), y_2^(s+1), ..., y_{j−1}^(s+1) have been constructed for the blocks 1, ..., j − 1. The outer component y_j^(s+1) is computed by first considering the inner component z_j^(s) for the block X_j (B), then the outer one (C). The procedure is iterated until convergence.
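The iterative procedure of Table 1 can be sketched as follows for g the identity (so that g'(x)/ϕ_j = 1). This is not the authors' code: the eigenvalue-based factorization K_j = R_j^t R_j below merely stands in for the incomplete Cholesky decomposition mentioned in the text, and all names are illustrative assumptions.

import numpy as np

def factorize_gram(K, tol=1e-10):
    # K = R^T R with R of size rank(K) x n; an eigendecomposition is used here
    # in place of the incomplete Cholesky factorization of the paper
    vals, vecs = np.linalg.eigh(K)
    keep = vals > tol
    return np.sqrt(vals[keep])[:, None] * vecs[:, keep].T

def kgcca_identity(K_list, C, tau, n_iter=100, tol=1e-8, seed=0):
    # PLS-style algorithm of Table 1 with g the identity
    n = K_list[0].shape[0]
    J = len(K_list)
    R = [factorize_gram(K) for K in K_list]
    N = [(1 - t) * Rj @ Rj.T / n + t * np.eye(Rj.shape[0]) for Rj, t in zip(R, tau)]
    N_inv = [np.linalg.inv(Nj) for Nj in N]
    rng = np.random.default_rng(seed)
    # A. initialisation: arbitrary vectors rescaled so that w_j^T N_j w_j = 1
    w = []
    for Rj, Nj, Nj_inv in zip(R, N, N_inv):
        w0 = Nj_inv @ rng.standard_normal(Rj.shape[0])
        w.append(w0 / np.sqrt(w0 @ Nj @ w0))
    y = [Rj.T @ wj for Rj, wj in zip(R, w)]

    def criterion():
        # bounded criterion of (11) with g the identity
        return sum(C[j, k] * (y[j] @ y[k]) / n
                   for j in range(J) for k in range(j + 1, J))

    cur = prev = criterion()
    for _ in range(n_iter):
        for j in range(J):
            # B. inner component: z_j = sum_k c_jk y_k (already-updated y_k for k < j)
            z = sum(C[j, k] * y[k] for k in range(J) if k != j)
            # C. outer weight (15) and outer component
            v = N_inv[j] @ (R[j] @ z)
            w[j] = v / np.sqrt(z @ R[j].T @ v)
            y[j] = R[j].T @ w[j]
        cur = criterion()
        if abs(cur - prev) < tol:
            break
        prev = cur
    return w, y, cur

With the centered Gram matrices Kj and Kk of the previous sketch, a call such as kgcca_identity([Kj, Kk], C=np.ones((2, 2)) - np.eye(2), tau=[0.5, 0.5]) returns weight vectors w_j and components y_j; printing the criterion value across iterations is a simple way to observe the monotone increase claimed for the algorithm.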

The essential feature of this algorithm is that each replacement is optimal and sequential, that is to say that w_j^(s) must be replaced by w_j^(s+1) before replacing w_{j+1}^(s). This is the essence of the Gauss-Seidel algorithm; this sequential approach leads to the monotone convergence of the algorithm.

4 Conclusion

KGCCA is a very attractive framework for nonlinear multiblock data analysis and covers a large spectrum of existing methods depending on the values of the regularization parameters. Moreover, the proposed algorithm is computationally efficient but has some limitations: (1) there is no guarantee that the algorithm converges towards a fixed point of the stationary equations; (2) there is no guarantee that the algorithm converges towards a global optimum of the criterion.

References

[1] F. R. Bach and M. I. Jordan. Kernel independent component analysis. Journal of Machine Learning Research, 3:1-48, 2003.
[2] K. Fukumizu, F. R. Bach, and A. Gretton. Statistical consistency of kernel canonical correlation analysis. Journal of Machine Learning Research, 8:361-383, 2007.
[3] O. Ledoit and M. Wolf. A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2):365-411, 2004.
[4] A. Tenenhaus and M. Tenenhaus. Regularized Generalized Canonical Correlation Analysis. Psychometrika, in revision.