Regularized Generalized Canonical Correlation Analysis Extended to Symbolic Data


Abstract Regularized Generalized Canonical Correlation Analysis (RGCCA) is a component-based approach which aims at studying the relationships between several blocks of numerical variables. In this paper we propose a method called Symbolic Generalized Canonical Correlation Analysis (Symbolic GCCA) that extends RGCCA to symbolic data. It is a versatile tool for multi-block data analysis that can deal with any type of dataset (e.g. observations described by intervals, histograms, ...) provided that a relevant kernel function is defined for each block. A monotonically convergent algorithm for Symbolic GCCA is presented and applied to a 4-block dataset of power plant cooling towers described by histograms.

Keywords Symbolic Data, Regularized Generalized Canonical Correlation Analysis, kernel functions

1 Introduction

The modern problem of managing and analyzing massive amounts of data does not only concern dataset size; it also concerns dealing with data that can be more or less complex. On the web site of ECML/PKDD 2007 on Mining Complex Data, complex data were defined as follows: in contrast to typical tabular data, complex data can consist of heterogeneous data types, can come from different sources, or live in high-dimensional spaces. All these specificities call for new data mining strategies. The complexity can come from the object description itself, as in the case of images, videos and audio/text documents, or from the dataset structure, as in the case of distributed, heterogeneous or spatio-temporal data. A typical example is a medical patient who can be described by heterogeneous data such as images, text documents and socio-demographic information.

In practice, complex data are more or less based on several kinds of observations described by standard numerical and/or categorical data contained in several related data tables. Usually, data mining deals with observations that are described by several standard variables, numerical or categorical. Symbolic data analysis (SDA) [1], [2] deals with concepts that are in general defined from the original observations. These concepts are described by symbolic data, which can be standard categorical or numerical data but also more complex descriptions such as sets, sequences of weighted values, intervals, histograms and the like. The word symbolic is used because these more complex descriptions cannot be manipulated just like real numbers.

We will proceed as follows. Starting from complex data, such as the data describing the power plant cooling towers that illustrate the present study, a fusion process is applied to obtain a symbolic data table, from which new kinds of knowledge are discovered using specific methodological tools extended to concepts, considered as a new type of observation. The present paper investigates an extension to symbolic data of Regularized Generalized Canonical Correlation Analysis (RGCCA), introduced in [22]. RGCCA is itself a generalization of regularized CCA [27], [15] to the more-than-two-block case and offers a unifying view of various multi-block data analysis methods.

The paper is organized as follows: RGCCA is briefly introduced in section 2 and a monotonically convergent algorithm for RGCCA is presented. Then, using appropriate positive definite kernel functions, we apply this version of RGCCA to symbolic data. Examples of kernel functions for symbolic data are discussed in section 3. Finally, section 4 illustrates the usefulness of symbolic GCCA on a 4-block dataset for studying power plant cooling towers described by histograms.

2 Regularized Generalized Canonical Correlation Analysis

Regularized Generalized Canonical Correlation Analysis (RGCCA), proposed in [22], is a method for studying associations between more than two blocks of variables. RGCCA aims at extracting the information shared by J blocks of centered variables X_1, ..., X_J, taking into account an a priori graph of connections between blocks specified by a binary design matrix C = {c_jk} such that c_jk = 1 if blocks X_j and X_k are connected and c_jk = 0 otherwise. RGCCA is defined as the following optimization problem:

$$\max_{a_1, a_2, \dots, a_J}\; \sum_{\substack{j,k=1 \\ j \neq k}}^{J} c_{jk}\, g\big(\operatorname{cov}(X_j a_j, X_k a_k)\big) \quad \text{subject to}\quad (1-\tau_j)\operatorname{var}(X_j a_j) + \tau_j \|a_j\|^2 = 1,\; j = 1, \dots, J \qquad (1)$$

In this optimization problem, g can be defined as g(x) = x (Horst scheme proposed in [14]), g(x) = |x| (Centroid scheme proposed in [29]) or g(x) = x^2 (Factorial scheme proposed in [16]).
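To make criterion (1) concrete, here is a minimal sketch (in Python/NumPy; the function name and block layout are our own illustrative assumptions, not from the paper) that evaluates the criterion for given outer weight vectors under the three schemes:

```python
import numpy as np

def rgcca_criterion(X, a, C, scheme="horst"):
    """Evaluate criterion (1): sum over connected pairs of g(cov(X_j a_j, X_k a_k)).

    X: list of centered (n x p_j) blocks; a: list of outer weight vectors;
    C: binary design matrix; scheme selects the function g.
    """
    g = {"horst": lambda x: x,            # g(x) = x
         "centroid": abs,                 # g(x) = |x|
         "factorial": lambda x: x ** 2}[scheme]
    n, J = X[0].shape[0], len(X)
    y = [X[j] @ a[j] for j in range(J)]   # outer components y_j = X_j a_j
    return sum(C[j, k] * g(y[j] @ y[k] / n)   # covariance of centered components
               for j in range(J) for k in range(J) if j != k)
```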

The vector a_j (resp. y_j = X_j a_j) is referred to as an outer weight vector (resp. an outer component). The Horst scheme penalizes structural negative correlations between components, while both the Centroid and the Factorial schemes can be viewed as attractive alternatives that enable two components to be negatively correlated. From an optimization point of view, the shrinkage parameters τ_j ∈ [0, 1], j = 1, ..., J, in (1) smoothly interpolate between the maximization of the covariance (τ_j = 1 for all j) and the maximization of the correlation (τ_j = 0 for all j).

Optimization problem (1) is solved using Algorithm 1 described below. Let us denote by K_j = X_j X_j^t the n × n matrix of inner products between observations.

Algorithm 1 Dual algorithm for Regularized Generalized Canonical Correlation Analysis

Step A. Initialization. Choose J arbitrary vectors α_1^0, ..., α_J^0 and rescale them so that the constraints in (1) hold:

$$\alpha_j^0 \leftarrow \left[ (\alpha_j^0)^t \left( \tau_j I + (1-\tau_j)\tfrac{1}{n} K_j \right) K_j\, \alpha_j^0 \right]^{-1/2} \alpha_j^0$$

repeat (s = 0, 1, 2, ...)
  for j = 1, 2, ..., J do

  Step B. Inner component for block j:

$$z_j^s = \sum_{k=1}^{j-1} c_{jk}\, w\!\left(\operatorname{cov}(K_j \alpha_j^s, K_k \alpha_k^{s+1})\right) K_k \alpha_k^{s+1} \;+\; \sum_{k=j+1}^{J} c_{jk}\, w\!\left(\operatorname{cov}(K_j \alpha_j^s, K_k \alpha_k^s)\right) K_k \alpha_k^s$$

  where w(x) = 1 for the Horst scheme, w(x) = x for the factorial scheme and w(x) = sign(x) for the centroid scheme.

  Step C. Dual outer weight vector for block j:

$$\alpha_j^{s+1} = \left[ (z_j^s)^t K_j \left( \tau_j I + (1-\tau_j)\tfrac{1}{n} K_j \right)^{-1} z_j^s \right]^{-1/2} \left( \tau_j I + (1-\tau_j)\tfrac{1}{n} K_j \right)^{-1} z_j^s$$

  end for
until convergence

The procedure begins with an arbitrary choice of initial values α_1^0, ..., α_J^0 (Step A of Algorithm 1). Assuming that the dual outer weight vectors α_1^{s+1}, α_2^{s+1}, ..., α_{j-1}^{s+1} have been constructed for blocks X_1, X_2, ..., X_{j-1}, the dual outer weight vector α_j^{s+1} is computed by considering the inner component z_j^s for block X_j given in Step B of Algorithm 1 and the formula given in Step C. The procedure is iterated until convergence. Convergence is proved in [21]: the bounded criterion to be maximized increases at each step of the iterative procedure until reaching a plateau.
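As a companion to Algorithm 1, the following NumPy sketch spells out one possible implementation of the dual iteration. It is our own reading, not the authors' reference code; it assumes centered kernel matrices K[j], that every block is connected to at least one other, and a simple stopping rule based on the change of the components.

```python
import numpy as np

def dual_rgcca(K, C, tau, scheme="factorial", tol=1e-8, max_iter=500, seed=0):
    """Sketch of Algorithm 1 (dual RGCCA) for centered n x n kernels K[j]."""
    n, J = K[0].shape[0], len(K)
    # w is the derivative of the scheme function g
    w = {"horst": lambda x: 1.0,
         "factorial": lambda x: x,
         "centroid": np.sign}[scheme]
    # M_j = tau_j I + (1 - tau_j) K_j / n appears in Steps A and C
    M = [tau[j] * np.eye(n) + (1.0 - tau[j]) * K[j] / n for j in range(J)]

    # Step A: arbitrary start, rescaled so that alpha' M_j K_j alpha = 1,
    # i.e. (1 - tau_j) var(y_j) + tau_j ||a_j||^2 = 1 with a_j = X_j' alpha_j
    rng = np.random.default_rng(seed)
    alpha = [rng.standard_normal(n) for _ in range(J)]
    alpha = [a / np.sqrt(a @ M[j] @ K[j] @ a) for j, a in enumerate(alpha)]

    for _ in range(max_iter):
        previous = [K[j] @ alpha[j] for j in range(J)]
        for j in range(J):
            y_j = K[j] @ alpha[j]
            # Step B: inner component mixing the connected blocks
            # (alpha[k] for k < j has already been updated in this sweep)
            z = np.zeros(n)
            for k in range(J):
                if k != j and C[j, k]:
                    y_k = K[k] @ alpha[k]
                    z += C[j, k] * w(y_j @ y_k / n) * y_k
            # Step C: alpha_j <- M_j^{-1} z_j, renormalized
            v = np.linalg.solve(M[j], z)
            alpha[j] = v / np.sqrt(z @ K[j] @ v)
        if max(np.linalg.norm(K[j] @ alpha[j] - previous[j])
               for j in range(J)) < tol:
            break
    return alpha  # dual outer weights; the components are K[j] @ alpha[j]
```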

We stress that Algorithm 1 is not the original (primal) formulation of the RGCCA algorithm presented in [22], but rather a dual formulation originally proposed in [21]. The key difference between the primal and the dual formulations lies in the fact that it is always possible to express a_j as a linear combination of the observations of X_j, that is a_j = X_j^t α_j. Moreover, α_j is optimized in the dual formulation, while a_j is optimized in the primal one. For more details on the dual formulation we refer interested readers to [21]. As will be seen, the dual formulation better fits the handling of symbolic data.

In general, Algorithm 1 does not necessarily converge to the global optimum and restarts are needed for reaching a globally optimal solution. Empirically, we note that Algorithm 1 is not very sensitive to the starting point and convergence (tolerance = ) is usually reached within a few iterations.

It was assumed without loss of generality that the variables were centered; in fact, the data K_j can be centered by simply applying the following transform:

$$K_j \leftarrow \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^t\right) K_j \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^t\right) \qquad (2)$$

where I denotes the n-dimensional identity matrix and 1 the n-vector of ones. Moreover, we stress that only first-dimension components are built in Algorithm 1. Components related to higher dimensions can easily be obtained by applying the same procedure to blocks deflated with respect to the preceding components. The deflation operation is obtained from K_j after extraction of y_j using the following formula:

$$K_j \leftarrow \left(I - y_j y_j^t\right) K_j \left(I - y_j y_j^t\right) \qquad (3)$$

To conclude this section, observe that Algorithm 1 solves optimization problem (1) by manipulating the observations only through pairwise inner products between observations. As will be seen in section 3, symbolic GCCA is derived from Algorithm 1 by designing appropriate pairwise inner products between symbolic data.
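Transforms (2) and (3) are one-liners in the same NumPy setting; a short sketch (helper names are ours) follows.

```python
import numpy as np

def center_kernel(K):
    """Transform (2): double-center K with H = I - (1/n) 1 1'."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H

def deflate_kernel(K, y):
    """Transform (3): project the extracted component y out of K.
    Dividing by y'y makes I - y y' / (y'y) an orthogonal projector,
    which matches (3) when y is scaled to unit norm."""
    P = np.eye(len(y)) - np.outer(y, y) / (y @ y)
    return P @ K @ P
```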

2.1 Special cases of RGCCA

Depending on the choice of the design matrix, the shrinkage parameters and the function g, RGCCA recovers a number of existing methods; Table 1 gives an overview of the methods that constitute the RGCCA framework.

Table 1 Special cases of RGCCA: PLS regression [30], canonical correlation analysis [12], redundancy analysis [25], regularized CCA [27], [15], [18], regularized redundancy analysis [20], SUMCOR [11], SSQCOR [13], SABSCOR [10], SUMCOV [26], SSQCOV [10], SABSCOV [14], Carroll's CCA [6], Multiple Co-Inertia Analysis [7]. X_{J+1} is called a super-block and is equal to the concatenation of blocks X_1, ..., X_J (X_{J+1} = [X_1 X_2 ... X_J]).

TWO-BLOCK CASE
- PLS regression: argmax_{a_1,a_2} cov(X_1 a_1, X_2 a_2) s.t. ||a_1|| = ||a_2|| = 1
- Canonical Correlation Analysis: argmax_{a_1,a_2} cov(X_1 a_1, X_2 a_2) s.t. var(X_1 a_1) = var(X_2 a_2) = 1
- Redundancy analysis: argmax_{a_1,a_2} cov(X_1 a_1, X_2 a_2) s.t. ||a_1|| = var(X_2 a_2) = 1
- Regularized CCA: argmax_{a_1,a_2} cov(X_1 a_1, X_2 a_2) s.t. (1 − τ_j) var(X_j a_j) + τ_j ||a_j||^2 = 1, j = 1, 2
- Regularized redundancy analysis: argmax_{a_1,a_2} cov(X_1 a_1, X_2 a_2) s.t. ||a_1|| = 1 and (1 − τ_2) var(X_2 a_2) + τ_2 ||a_2||^2 = 1

MULTI-BLOCK CASE
- SUMCOR: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J cov(X_j a_j, X_k a_k) s.t. var(X_j a_j) = 1, j = 1, ..., J
- SSQCOR: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J cov^2(X_j a_j, X_k a_k) s.t. var(X_j a_j) = 1, j = 1, ..., J
- SABSCOR: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J |cov(X_j a_j, X_k a_k)| s.t. var(X_j a_j) = 1, j = 1, ..., J
- SUMCOV: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J cov(X_j a_j, X_k a_k) s.t. ||a_j|| = 1, j = 1, ..., J
- SSQCOV: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J cov^2(X_j a_j, X_k a_k) s.t. ||a_j|| = 1, j = 1, ..., J
- SABSCOV: argmax_{a_1,...,a_J} Σ_{j,k=1; j≠k}^J |cov(X_j a_j, X_k a_k)| s.t. ||a_j|| = 1, j = 1, ..., J
- Carroll's CCA: argmax_{a_1,...,a_{J+1}} Σ_{j=1}^J cov^2(X_j a_j, X_{J+1} a_{J+1}) s.t. var(X_j a_j) = 1 for j = 1, ..., J_1 and j = J + 1, and ||a_j|| = 1 for j = J_1 + 1, ..., J
- MCOA: argmax_{a_1,...,a_{J+1}} Σ_{j=1}^J cov^2(X_j a_j, X_{J+1} a_{J+1}) s.t. ||a_j|| = 1, j = 1, ..., J, and var(X_{J+1} a_{J+1}) = 1

Multi-block data analysis. In multi-block data analysis, all blocks X_j, j = 1, ..., J, are assumed to be connected, and many criteria have been proposed in the literature with the objective of finding block components satisfying some kind of optimality; some are based on correlation, others on covariance. Table 1 reports the main ones: (i) SUMCOR (SUM of CORrelations) [11], (ii) SSQCOR (Sum of SQuared CORrelations) [13], (iii) SABSCOR (Sum of ABSolute CORrelations) [10], (iv) SUMCOV (SUM of COVariances) [26], (v) SSQCOV (Sum of SQuared COVariances) [10], (vi) SABSCOV (Sum of ABSolute COVariances) [14].

Regularized Canonical Correlation Analysis. Regularized Canonical Correlation Analysis [27], [15], [18] is defined as the two-block optimization problem reported in Table 1: maximize cov(X_1 a_1, X_2 a_2) subject to (1 − τ_j) var(X_j a_j) + τ_j ||a_j||^2 = 1, j = 1, 2. For the extreme cases τ_1 = 0 or 1 and τ_2 = 0 or 1 (which correspond exactly to the framework described in [3] and [5]), regularized CCA covers situations that go from Tucker's inter-battery factor analysis [24] to Canonical Correlation Analysis [12] while passing through redundancy analysis [25]. The special case 0 ≤ τ_1 ≤ 1 and τ_2 = 0, which corresponds to a regularized version of redundancy analysis, has been studied in [20] and [4].

Hierarchical model. To conclude, we stress that the introduction of the design matrix allows analyzing complex structural relationships between blocks, such as the hierarchical models introduced in [28]. Hierarchical models will be illustrated in section 4.

It is quite remarkable that the single RGCCA algorithm described in Algorithm 1 offers a unifying view of all the methods reported in Table 1. This is of practical interest for unified statistical analysis and unified implementation strategies. In the next section we show how to extend all the methods reported in Table 1 by applying Algorithm 1 to symbolic data.

3 Kernels for symbolic data

Symbolic GCCA is a versatile tool for multi-block data analysis that allows handling any type of dataset (e.g. observations described by histograms, intervals, strings, images, ...) as long as a relevant kernel function can be defined for each block [9]. Using kernels on such structured symbolic datasets usually involves first choosing a similarity measure between pairs of symbolic objects for each block, and then transforming these n × n similarity matrices into symmetric positive definite n × n matrices called kernel matrices. Thus, symbolic GCCA reduces to a two-step procedure:

Step 1. Design for each block a kernel function that encodes the proximity of symbolic data. For instance, the L1 distance between histograms can be used, together with a Laplacian transformation, to form the kernel matrix:

$$k_j(x_{jk}, x_{jl}) = \exp\Big(-\gamma \sum_{h=1}^{p_j} |x_{jkh} - x_{jlh}|\Big) \qquad (4)$$

where x_{jk} is the set of symbolic measurements observed on the kth individual for the jth block and x_{jkh} is the hth symbolic measurement observed on the kth individual for the jth block. The positive definiteness of the kernel matrix guarantees that each of its elements corresponds to a pairwise inner product evaluation in some space induced by the kernel function (see for instance [19] or [9] for more details on kernel theory).

Step 2. Compute the kernel matrix associated with each block j, [K_j]_{kl} = k_j(x_{jk}, x_{jl}), and replace K_j = X_j X_j^t in Algorithm 1 with this kernel matrix. The index j in k_j emphasizes that kernel functions may differ from one block to another according to the nature of the block.
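As an illustration of Steps 1 and 2, here is a minimal sketch of the kernel matrix of Eq. (4) for one histogram block, assuming the n observations' histogram bins are stacked row-wise in an n × p_j array (the layout and function name are illustrative assumptions):

```python
import numpy as np

def laplacian_kernel_matrix(X, gamma=0.1):
    """[K_j]_{kl} = exp(-gamma * sum_h |x_{jkh} - x_{jlh}|), as in Eq. (4).

    X: (n x p_j) array, one row per observation (concatenated histogram bins).
    """
    # pairwise L1 distances between rows, then the Laplacian transform
    D = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    return np.exp(-gamma * D)
```

The resulting matrix can then be centered with transform (2) and used in place of K_j = X_j X_j^t in Algorithm 1.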

4 Case study: nuclear power plant cooling dataset

In order to analyze the degradation of cooling towers, the French energy company EDF has collected surveying data since their construction [8]. Twenty-one cooling towers are described by 10 different histograms related to subsidence, geometric deformation, cracks and corrosion:

1. Geometric deformation is described by 3 histograms of the external hyperbolic shapes of the towers (Ecartabs_t_2, Ecartabs_1_2, Ecartabs_t_1).
2. Cracks are described by 2 histograms of the length and the orientation of the cracks of the towers (longfi_H_2, Orientation_2).
3. The corrosion level is described by the histogram of the length of the corrosion of the towers (longco_H_2).
4. Subsidence of the towers is described by 4 histograms (TAS_10ansH, TAS_TotalH, TAS_Diff, TAS_20ansH).

Each individual histogram is called a first-order block. Four second-order blocks (Geometric, Cracks, Corrosion, Subsidence) are built by concatenating individual histograms as follows:
- Geometric contains Ecartabs_t_2, Ecartabs_1_2 and Ecartabs_t_1;
- Cracks contains longfi_H_2 and Orientation_2;
- Corrosion contains longco_H_2;
- Subsidence contains TAS_10ansH, TAS_TotalH, TAS_Diff and TAS_20ansH.

Finally, a superblock concatenating all second-order blocks is also considered. Detecting abnormal degradation levels for some towers and determining the relational connections between measures at the tower level are necessary to understand the physical phenomena.

Therefore, to perform an accurate analysis of the cooling tower degradations, the four categories of information (geometric deformation, cracks, corrosion, subsidence) have to be considered simultaneously, which Symbolic GCCA allows.

4.1 Results

The R software was used to perform our analysis [17]. Symbolic GCCA was combined with the hierarchical model described in Figure 1. The hyperparameters (τ_j and γ) were set to 1 and 0.1, respectively, for all blocks.

Fig. 1 Hierarchical model: each first-order histogram block is connected to its second-order block (Geometric, Cracks, Corrosion, Subsidence).

Two components have been constructed for each block. The graphical display of the towers obtained by crossing the first two components of the superblock (y_1 and y_2) is shown in Figure 2. It may be observed that the left part of the graphical display concentrates the most damaged towers while the right part concentrates the least damaged ones.

Figure 3 is built by computing correlations between the first component of each block and the first two components of the super-block, y_1 and y_2.

Fig. 3 Correlation circle.

Analyzing these data is of interest for bringing out correlations between the different blocks, which can be used to anticipate degradation problems that may occur. For example, civil engineers need to know to which extent subsidence problems can cause other degradation problems (geometric deformation, cracks, corrosion areas). In this context, it is interesting to observe (see Figure 3) that the analysis indicates a clear separation of the geometric and subsidence blocks from the two other blocks (cracks and corrosion areas), and this brings out two clusters of towers regarding degradation problems (see Figure 2). The first cluster includes towers with both subsidence and geometric deformation problems.

Fig. 2 Factorial plan (y_1, y_2), where y_1 and y_2 are the block components related to the superblock (the 21 towers plotted against the first and second global components).

The second cluster includes towers with both cracks and corrosion problems. Figure 4 depicts the relational connections between blocks: the correlations between the first components of each block are reported.

Fig. 4 Diagram of relationships between blocks.

From Figure 4, it seems that the relationship between subsidence and geometry is stronger than the one between corrosion and cracks.

5 Conclusion and perspectives

The Symbolic GCCA method introduced in this paper extends a large number of well-known data analysis methods to the symbolic context by simply choosing appropriate kernel functions. We observe that the symbolic GCCA algorithm is computationally efficient with regard to the number of iterations needed to reach convergence. However, there is no guarantee that the algorithm converges towards a global optimum of the criterion, and a restart strategy can be used for reaching a globally optimal solution. A bootstrap method providing confidence intervals can be performed in order to indicate how reliable the correlations estimated with symbolic GCCA are.

Last, the choice of the hyperparameters (γ and τ_j) can be determined by a standard cross-validation procedure.

6 Acknowledgments

This work was supported by a grant from the French National Research Agency (ANR Investissement d'Avenir BRAINOMICS; grant ANR-10-BINF-04).

References

1. Billard, L. and Diday, E. Symbolic Data Analysis: Conceptual Statistics and Data Mining. Wiley, 2006.
2. Diday, E. and Noirhomme, M. Symbolic Data Analysis and the SODAS Software. Wiley, 2008.
3. Borga, M., Landelius, T. and Knutsson, H. A unified approach to PCA, PLS, MLR and CCA. Technical report, 1997.

4. Bougeard, S., Hanafi, M. and Qannari, E.M. Continuum redundancy-PLS regression: a simple continuum approach. Computational Statistics and Data Analysis, 52, 2008.
5. Burnham, A.J., Viveros, R. and MacGregor, J.F. Frameworks for latent variable multivariate regression. Journal of Chemometrics, 10:31–45, 1996.
6. Carroll, J.D. A generalization of canonical correlation analysis to three or more sets of variables. In Proc. 76th Conv. Am. Psych. Assoc., 1968.
7. Chessel, D. and Hanafi, M. Analyse de la co-inertie de K nuages de points. Revue de Statistique Appliquée, 44:35–60, 1996.
8. Courtois, A., Genest, Y., Afonso, F., Diday, E. and Orcesi, A. In-service inspection of reinforced concrete cooling towers: EDF's feedback. IALCCE 2012.
9. Cuturi, M. Positive definite kernels in machine learning. arXiv preprint, 2009.
10. Hanafi, M. and Kiers, H.A.L. Analysis of K sets of data, with differential emphasis on agreement between and within sets. Computational Statistics and Data Analysis, 51, 2006.
11. Horst, P. Relations among m sets of measures. Psychometrika, 26, 1961.
12. Hotelling, H. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24:417–441, 1933.
13. Kettenring, J.R. Canonical analysis of several sets of variables. Biometrika, 58:433–451, 1971.
14. Kramer, N. Analysis of high-dimensional data with partial least squares and boosting. Doctoral dissertation, Technische Universität Berlin.
15. Leurgans, S.E., Moyeed, R.A. and Silverman, B.W. Canonical correlation analysis when the data are curves. Journal of the Royal Statistical Society, Series B, 55:725–740, 1993.
16. Lohmöller, J.B. Latent Variables Path Modeling with Partial Least Squares. Physica-Verlag, Heidelberg, 1989.
17. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

18. Shawe-Taylor, J. and Cristianini, N. Kernel Methods for Pattern Analysis. Cambridge University Press, New York, 2004.
19. Schölkopf, B. and Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, 2002.
20. Takane, Y. and Hwang, H. Regularized linear and kernel redundancy analysis. Computational Statistics and Data Analysis, 52:394–405, 2007.
21. Tenenhaus, A., Philippe, C. and Frouin, V. Kernel Generalized Canonical Correlation Analysis. Technical report.
22. Tenenhaus, A. and Tenenhaus, M. Regularized Generalized Canonical Correlation Analysis. Psychometrika, 76:257–284, 2011.
23. Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M. and Lauro, C. PLS path modeling. Computational Statistics and Data Analysis, 48:159–205, 2005.
24. Tucker, L.R. An inter-battery method of factor analysis. Psychometrika, 23:111–136, 1958.
25. Van den Wollenberg, A.L. Redundancy analysis: an alternative for canonical correlation analysis. Psychometrika, 42:207–219, 1977.
26. Van de Geer, J.P. Linear relations among k sets of variables. Psychometrika, 49, 1984.
27. Vinod, H.D. Canonical ridge and econometrics of joint production. Journal of Econometrics, 4:147–166, 1976.
28. Wold, H. Soft modeling: the basic design and some extensions. In Systems Under Indirect Observation, Part 2, K.G. Jöreskog and H. Wold (Eds), North-Holland, Amsterdam, pages 1–54, 1982.
29. Wold, H. Partial Least Squares. In Encyclopedia of Statistical Sciences, vol. 6, Kotz, S. and Johnson, N.L. (Eds), John Wiley and Sons, New York, 1985.
30. Wold, S., Martens, H. and Wold, H. The multivariate calibration problem in chemistry solved by the PLS method. In Proc. Conf. Matrix Pencils, Ruhe, A. and Kågström, B. (Eds), March 1982, Lecture Notes in Mathematics, Springer Verlag, Heidelberg, 1983.
