The CCA Package. March 20, 2007

Type Package
Title Canonical correlation analysis
Version 1.1
Date 2007-03-20
Author Ignacio González, Sébastien Déjean
Maintainer Sébastien Déjean <sebastien.dejean@math.ups-tlse.fr>
Description The package provides a set of functions that extend the cancor function with new numerical and graphical outputs. It also includes a regularized extension of canonical correlation analysis to deal with datasets with more variables than observations.
Depends fda, fields
License GPL
URL www.lsp.ups-tlse.fr/cca

R topics documented:

  CCA-package
  cc
  comput
  estim.regul
  img.estim.regul
  img.matcor
  loo
  matcor
  nutrimouse
  plt.cc
  plt.indiv
  plt.var
  rcc
  Index

CCA-package        Canonical correlation analysis

Description

The package provides a set of functions that extend the cancor() function with new numerical and graphical outputs. It includes a regularized extension of canonical correlation analysis to deal with datasets with more variables than observations, and it can handle missing values.

Details

  Package:  CCA
  Type:     Package
  Version:  1.0
  Date:     2007-02-02
  License:  None

Author(s)

Ignacio González, Sébastien Déjean
Maintainer: Sébastien Déjean <sebastien.dejean@math.ups-tlse.fr>

References

www.lsp.ups-tlse.fr/cca


cc        Canonical Correlation Analysis

Description

The function performs Canonical Correlation Analysis to highlight correlations between two data matrices. It completes the cancor() function with supplemental numerical and graphical outputs and can handle missing values.

Usage

    cc(X, Y)

Arguments

  X    numeric matrix (n * p), containing the X coordinates.
  Y    numeric matrix (n * q), containing the Y coordinates.

Details

Canonical correlation analysis seeks the linear combinations of the X variables that are most correlated with linear combinations of the Y variables. Let PX and PY be the projectors onto the column spaces of X and Y, respectively. The eigenanalysis of PXPY provides the canonical correlations (the square roots of the eigenvalues) and the coefficients of the linear combinations that define the canonical variates (the eigenvectors).

Value

A list containing the following components:

  cor      canonical correlations
  names    a list containing the names to be used for individuals and variables for graphical outputs
  xcoef    estimated coefficients for the X variables as returned by cancor()
  ycoef    estimated coefficients for the Y variables as returned by cancor()
  scores   a list returned by the internal function comput() containing individuals and variables coordinates on the canonical variates basis.

References

www.lsp.ups-tlse.fr/cca

See Also

rcc, plt.cc

Examples

    data(nutrimouse)
    X = as.matrix(nutrimouse$gene[, 1:10])
    Y = as.matrix(nutrimouse$lipid)
    res.cc = cc(X, Y)
    plot(res.cc$cor, type = "b")
    plt.cc(res.cc)
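The projector formulation in the Details of cc can be checked numerically. The following is a minimal sketch in base R, assuming simulated full-rank data and column-centred matrices; the helper proj and all variable names are illustrative and are not part of the CCA package.

```r
# Sketch: canonical correlations from the eigenanalysis of PX %*% PY.
# Simulated data; all names here are illustrative only.
set.seed(1)
n <- 50; p <- 3; q <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- matrix(rnorm(n * q), n, q) + X[, 1:2]  # induce some correlation

# Centre the columns so that the projections relate to correlations.
Xc <- scale(X, center = TRUE, scale = FALSE)
Yc <- scale(Y, center = TRUE, scale = FALSE)

# Orthogonal projector onto the column space of a full-rank matrix M.
proj <- function(M) M %*% solve(crossprod(M), t(M))

PX <- proj(Xc)
PY <- proj(Yc)

# The nonzero eigenvalues of PX PY are the squared canonical correlations.
# Re() discards tiny imaginary parts from the non-symmetric eigenproblem.
ev  <- Re(eigen(PX %*% PY, only.values = TRUE)$values)
rho <- sqrt(pmax(sort(ev, decreasing = TRUE)[seq_len(min(p, q))], 0))

# Compare with the canonical correlations computed by stats::cancor().
rho_ref <- cancor(X, Y)$cor
print(cbind(projector = rho, cancor = rho_ref))
```

The two columns should agree up to numerical error. In practice a QR-based routine such as cancor() is preferable to forming the n * n projectors explicitly.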

comput        Additional computations for CCA

Description

The comput() function can be viewed as an internal function. It is called by cc() and rcc() to perform additional computations. Users do not need to call it directly.

Usage

    comput(X, Y, res)

Arguments

  X      numeric matrix (n * p), containing the X coordinates.
  Y      numeric matrix (n * q), containing the Y coordinates.
  res    results provided by the cc() and rcc() functions.

Value

A list containing the following components:

  xscores          X canonical variates
  yscores          Y canonical variates
  corr.X.xscores   Correlation between X and X canonical variates
  corr.Y.xscores   Correlation between Y and X canonical variates
  corr.X.yscores   Correlation between X and Y canonical variates
  corr.Y.yscores   Correlation between Y and Y canonical variates

See Also

cc, rcc
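The quantities listed under Value for comput can be illustrated with base R. The sketch below assumes simulated data and uses stats::cancor() for the coefficients. It mirrors the documented component names but is not the package's own implementation.

```r
# Sketch: canonical variates and their correlations with the original
# variables, computed from stats::cancor(). Illustrative names only.
set.seed(2)
n <- 40
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("x", 1:3)))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, paste0("y", 1:2)))
Y <- Y + X[, 1:2]  # induce some correlation between the two sets

fit <- cancor(X, Y)
d   <- length(fit$cor)  # number of canonical dimensions

# Canonical variates: centred data times the coefficient matrices.
xscores <- sweep(X, 2, fit$xcenter) %*% fit$xcoef[, seq_len(d), drop = FALSE]
yscores <- sweep(Y, 2, fit$ycenter) %*% fit$ycoef[, seq_len(d), drop = FALSE]

# Correlations between the original variables and the canonical variates.
corr.X.xscores <- cor(X, xscores)
corr.Y.xscores <- cor(Y, xscores)
corr.X.yscores <- cor(X, yscores)
corr.Y.yscores <- cor(Y, yscores)

# Paired variates are correlated at exactly the canonical correlations.
print(rbind(paired = diag(cor(xscores, yscores)), cancor = fit$cor))
```

The correlation matrices between variables and variates are the coordinates typically drawn on a correlation circle, which is what a variables plot such as plt.var() represents.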

estim.regul        Estimate the parameters of regularization

Description

Calculate the leave-one-out criterion on a 2D grid to determine optimal values for the parameters of regularization.

Usage

    estim.regul(X, Y, grid1 = NULL, grid2 = NULL, plt = TRUE)

Arguments

  X        numeric matrix (n * p), containing the X coordinates.
  Y        numeric matrix (n * q), containing the Y coordinates.
  grid1    vector defining the values of lambda1 to be tested. If NULL, the vector is defined as seq(0.001, 1, length = 5)
  grid2    vector defining the values of lambda2 to be tested. If NULL, the vector is defined as seq(0.001, 1, length = 5)
  plt      logical argument indicating whether an image should be plotted by calling the img.estim.regul() function.

Value

A 3-vector containing the two values of the regularization parameters at which the leave-one-out criterion reached its maximum, and the maximal value reached on the grid.

See Also

loo

Examples

    #data(nutrimouse)
    #X = as.matrix(nutrimouse$gene)
    #Y = as.matrix(nutrimouse$lipid)
    #res.regul = estim.regul(X, Y, c(0.01, 0.5), c(0.1, 0.2, 0.3))
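The grid search described for estim.regul can be sketched in base R. The criterion used below is one plausible leave-one-out form: the correlation, over all observations, between the first pair of canonical variates predicted for each held-out observation. Whether the package computes exactly this quantity is an assumption. The helpers rcca_first and loo_crit are hypothetical and are not CCA package functions.

```r
# Sketch of a leave-one-out criterion for regularized CCA on a 2D grid.
# rcca_first() and loo_crit() are hypothetical helpers for illustration.
set.seed(3)
n <- 30
X <- matrix(rnorm(n * 5), n, 5)
Y <- matrix(rnorm(n * 4), n, 4) + X[, 1:4]

# First pair of regularized canonical coefficient vectors.
rcca_first <- function(X, Y, l1, l2) {
  Cxx <- var(X) + diag(l1, ncol(X))  # ridge term on the diagonal
  Cyy <- var(Y) + diag(l2, ncol(Y))
  Cxy <- cov(X, Y)
  M <- solve(Cxx, Cxy) %*% solve(Cyy, t(Cxy))
  a <- Re(eigen(M)$vectors[, 1])
  if (a[which.max(abs(a))] < 0) a <- -a  # fix the arbitrary sign
  b <- drop(solve(Cyy, t(Cxy) %*% a))
  list(a = a, b = b, xm = colMeans(X), ym = colMeans(Y))
}

# Fit without observation i, then score observation i with that fit.
loo_crit <- function(X, Y, l1, l2) {
  n <- nrow(X)
  u <- v <- numeric(n)
  for (i in seq_len(n)) {
    w <- rcca_first(X[-i, , drop = FALSE], Y[-i, , drop = FALSE], l1, l2)
    u[i] <- sum((X[i, ] - w$xm) * w$a)
    v[i] <- sum((Y[i, ] - w$ym) * w$b)
  }
  cor(u, v)
}

# Evaluate the criterion on the grid and keep the maximizing pair.
grid1 <- c(0.01, 0.1, 1)
grid2 <- c(0.01, 0.1, 1)
cv <- outer(grid1, grid2,
            Vectorize(function(l1, l2) loo_crit(X, Y, l1, l2)))
best <- which(cv == max(cv), arr.ind = TRUE)[1, ]
res <- c(lambda1 = grid1[[best[1]]], lambda2 = grid2[[best[2]]],
         CVscore = max(cv))
print(res)
```

The result is a 3-vector in the spirit of the documented Value: the two selected parameters and the maximal criterion value. The cost grows as n times the grid size, which is presumably why the manual's own example is commented out.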

img.estim.regul        Plot the cross-validation criterion

Description

This function provides a visualization of the values of the cross-validation criterion obtained on a grid defined in the function estim.regul().

Usage

    img.estim.regul(estim)

Arguments

  estim    Object returned by estim.regul().

See Also

estim.regul


img.matcor        Image of correlation matrices

Description

Display images of the correlation matrices within and between two data matrices.

Usage

    img.matcor(correl, type = 1)

Arguments

  correl   Correlation matrices as returned by the matcor() function
  type     character determining the kind of plots to be produced: either one ((p+q) * (p+q)) matrix or three matrices (p * p), (q * q) and (p * q)

Details

Matrices are pre-processed before calling the image() function so that, as in the numerical representation, the diagonal runs from the upper-left corner to the bottom-right one.

See Also

matcor

Examples

    data(nutrimouse)
    X = as.matrix(nutrimouse$gene)
    Y = as.matrix(nutrimouse$lipid)
    correl = matcor(X, Y)
    img.matcor(correl)
    img.matcor(correl, type = 2)


loo        Leave-one-out criterion

Description

The loo() function can be viewed as an internal function. It is called by estim.regul() to obtain optimal values for the two parameters of regularization.

Usage

    loo(X, Y, lambda1, lambda2)

Arguments

  X          numeric matrix (n * p), containing the X coordinates.
  Y          numeric matrix (n * q), containing the Y coordinates.
  lambda1    parameter of regularization for X variables
  lambda2    parameter of regularization for Y variables

See Also

estim.regul


matcor        Correlations matrices

Description

The function computes the correlation matrices within and between two datasets.

Usage

    matcor(X, Y)

Arguments

  X    numeric matrix (n * p), containing the X coordinates.
  Y    numeric matrix (n * q), containing the Y coordinates.

Value

A list containing the following components:

  Xcor     Correlation matrix (p * p) for the X variables
  Ycor     Correlation matrix (q * q) for the Y variables
  XYcor    Correlation matrix ((p+q) * (p+q)) between X and Y variables

See Also

img.matcor
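The three matrices listed under Value for matcor can be reproduced with base R's cor(). This is a minimal sketch on simulated data, using the documented component names as local variables. It is not the package's own code.

```r
# Sketch: within- and between-set correlation matrices with base R.
set.seed(4)
n <- 40
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("y1", "y2")))

Xcor  <- cor(X)            # (p * p) correlations among the X variables
Ycor  <- cor(Y)            # (q * q) correlations among the Y variables
XYcor <- cor(cbind(X, Y))  # ((p+q) * (p+q)) correlations of all variables

# The off-diagonal block of XYcor holds the (p * q) cross-correlations.
cross <- XYcor[colnames(X), colnames(Y)]
print(round(cross, 3))
```

The full (p+q) * (p+q) matrix contains Xcor and Ycor as its diagonal blocks, so the three-matrix view (p * p), (q * q) and (p * q) is a rearrangement of the same information.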

Examples

    data(nutrimouse)
    X = as.matrix(nutrimouse$gene)
    Y = as.matrix(nutrimouse$lipid)
    correl = matcor(X, Y)
    img.matcor(correl)
    img.matcor(correl, type = 2)


nutrimouse        Nutrimouse dataset

Description

The nutrimouse dataset comes from a nutrition study in the mouse. It was provided by Pascal Martin from the Toxicology and Pharmacology Laboratory (French National Institute for Agronomic Research).

Usage

    data(nutrimouse)

Format

A list containing the following components:

  gene:      data frame (40 * 120) with numerical variables
  lipid:     data frame (40 * 21) with numerical variables
  diet:      factor vector (40)
  genotype:  factor vector (40)

Details

Two sets of variables were measured on 40 mice:

  expressions of 120 genes potentially involved in nutritional problems.
  concentrations of 21 hepatic fatty acids.

The 40 mice were distributed in a two-factor experimental design (4 replicates):

  Genotype (2-level factor): wild-type and PPARalpha -/-
  Diet (5-level factor): Oils used for experimental diets preparation were corn and colza oils (50/50) for a reference diet (REF), hydrogenated coconut oil for a saturated fatty acid diet (COC), sunflower oil for an Omega6 fatty acid-rich diet (SUN), linseed oil for an Omega3-rich diet (LIN) and corn/colza/enriched fish oils for the FISH diet (43/43/14).

Source

P. Martin, H. Guillou, F. Lasserre, S. Déjean, A. Lan, J-M. Pascussi, M. San Cristobal, P. Legrand, P. Besse, T. Pineau - Novel aspects of PPARalpha-mediated regulation of lipid and xenobiotic metabolism revealed through a nutrigenomic study. Hepatology, in press, 2007.

References

www.inra.fr/internet/centres/toulouse/pharmacologie/pharmaco-moleculaire/acceuil.html

Examples

    data(nutrimouse)
    boxplot(nutrimouse$lipid)


plt.cc        Graphical outputs for canonical correlation analysis

Description

This function calls plt.var(), plt.indiv(), or both to represent the individuals and/or the variables on the canonical variates.

Usage

    plt.cc(res, d1 = 1, d2 = 2, int = 0.5, type = "b", ind.names = NULL,
           var.label = FALSE, Xnames = NULL, Ynames = NULL)

Arguments

  res          Object returned by cc() or rcc()
  d1           The dimension that will be represented on the horizontal axis
  d2           The dimension that will be represented on the vertical axis
  int          The radius of the inner circle
  type         Character "v" (variables), "i" (individuals) or "b" (both) specifying which plot to produce.
  ind.names    vector containing the names of the individuals
  var.label    logical indicating whether labels should be plotted on the variables representation
  Xnames       vector giving the names of X variables
  Ynames       vector giving the names of Y variables

References

www.lsp.ups-tlse.fr/biopuces/cca

See Also

plt.indiv, plt.var

Examples

    data(nutrimouse)
    X = as.matrix(nutrimouse$gene[, 1:10])
    Y = as.matrix(nutrimouse$lipid)
    res.cc = cc(X, Y)
    plt.cc(res.cc)
    plt.cc(res.cc, d1 = 1, d2 = 3, type = "v", var.label = TRUE)


plt.indiv        Individuals representation for CCA

Description

This function represents the individuals on the canonical variates.

Usage

    plt.indiv(res, d1, d2, ind.names = NULL)

Arguments

  res          Object returned by cc() or rcc()
  d1           The dimension that will be represented on the horizontal axis
  d2           The dimension that will be represented on the vertical axis
  ind.names    vector containing the names of the individuals

References

www.lsp.ups-tlse.fr/biopuces/cca

See Also

plt.var, plt.cc

plt.var        Variables representation for CCA

Description

This function represents the variables on the canonical variates.

Usage

    plt.var(res, d1, d2, int = 0.5, var.label = FALSE, Xnames = NULL,
            Ynames = NULL)

Arguments

  res          Object returned by cc() or rcc()
  d1           The dimension that will be represented on the horizontal axis
  d2           The dimension that will be represented on the vertical axis
  int          The radius of the inner circle
  var.label    logical indicating whether labels should be plotted on the variables representation
  Xnames       vector giving the names of X variables
  Ynames       vector giving the names of Y variables

References

www.lsp.ups-tlse.fr/biopuces/cca

See Also

plt.indiv, plt.cc


rcc        Regularized Canonical Correlation Analysis

Description

The function performs the regularized extension of Canonical Correlation Analysis to seek correlations between two data matrices when the number of columns (variables) exceeds the number of rows (observations).

Usage

    rcc(X, Y, lambda1, lambda2)

Arguments

  X          numeric matrix (n * p), containing the X coordinates.
  Y          numeric matrix (n * q), containing the Y coordinates.
  lambda1    Regularization parameter for X
  lambda2    Regularization parameter for Y

Details

When the number of columns is greater than the number of rows, the matrix X'X (and/or Y'Y) may be ill-conditioned. The regularization allows the inversion by adding a term on the diagonal.

Value

A list containing the following components:

  corr     canonical correlations
  names    a list containing the names to be used for individuals and variables for graphical outputs
  xcoef    estimated coefficients for the X variables as returned by cancor()
  ycoef    estimated coefficients for the Y variables as returned by cancor()
  scores   a list returned by the internal function comput() containing individuals and variables coordinates on the canonical variates basis.

References

Leurgans, Moyeed and Silverman (1993). Canonical correlation analysis when the data are curves. J. Roy. Statist. Soc. Ser. B. 55, 725-740.

Vinod (1976). Canonical ridge and econometrics of joint production. J. Econometr. 6, 129-137.

See Also

cc, estim.regul, plt.cc

Examples

    data(nutrimouse)
    X = as.matrix(nutrimouse$gene)
    Y = as.matrix(nutrimouse$lipid)
    res.cc = rcc(X, Y, 0.1, 0.2)
    plt.cc(res.cc)
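The effect of the diagonal term described for rcc can be seen in a small base R sketch. It assumes simulated data with more columns than rows and adds lambda1 and lambda2 to the diagonals of the sample covariance matrices. The exact scaling used inside the package is an assumption, and all names are illustrative.

```r
# Sketch: ridge-type regularization making a singular covariance invertible.
set.seed(5)
n <- 10; p <- 25; q <- 4
X <- matrix(rnorm(n * p), n, p)
Y <- matrix(rnorm(n * q), n, q)

Cxx <- var(X)
# With p > n the sample covariance has rank at most n - 1, so it is singular.
rank_Cxx <- qr(Cxx)$rank
print(c(rank = rank_Cxx, p = p))

# Adding a positive term on the diagonal restores invertibility.
lambda1 <- 0.1
lambda2 <- 0.2
Cxx_reg <- Cxx + diag(lambda1, p)
Cyy_reg <- var(Y) + diag(lambda2, q)
Cxy     <- cov(X, Y)

# Regularized canonical correlations: square roots of the eigenvalues of
# Cyy_reg^{-1} Cyx Cxx_reg^{-1} Cxy, a (q * q) matrix.
M   <- solve(Cyy_reg, t(Cxy)) %*% solve(Cxx_reg, Cxy)
rho <- sqrt(pmax(Re(eigen(M, only.values = TRUE)$values), 0))
print(rho)
```

Without the diagonal term, solve(Cxx, Cxy) would fail or be numerically meaningless here, and an unregularized analysis would report correlations of essentially 1 regardless of any real association. Larger lambda values shrink the correlations further, which is why the parameters are usually tuned, for example with a leave-one-out criterion.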

Index

Topic datasets
  nutrimouse

Topic dplot
  img.estim.regul
  img.matcor
  plt.cc
  plt.indiv
  plt.var

Topic multivariate
  cc
  CCA-package
  comput
  estim.regul
  loo
  matcor
  rcc