Package coop. November 14, 2017

Similar documents
Package gtheory. October 30, 2016

Package ShrinkCovMat

Package depth.plot. December 20, 2015

Package robustrao. August 14, 2017

Package Delaporte. August 13, 2017

Package varband. November 7, 2016

Package polypoly. R topics documented: May 27, 2017

Package Tcomp. June 6, 2018

Package invgamma. May 7, 2017

Package cvequality. March 31, 2018

Package NlinTS. September 10, 2018

Package magma. February 15, 2013

Package rbvs. R topics documented: August 29, Type Package Title Ranking-Based Variable Selection Version 1.0.

Package scio. February 20, 2015

Package SHIP. February 19, 2015

Package sscor. January 28, 2016

Package HIMA. November 8, 2017

Package thief. R topics documented: January 24, Version 0.3 Title Temporal Hierarchical Forecasting

Package locfdr. July 15, Index 5

Package FinCovRegularization

Package cartogram. June 21, 2018

Package beam. May 3, 2018

Package jmcm. November 25, 2017

Package EMVS. April 24, 2018

Package unbalhaar. February 20, 2015

Package funrar. March 5, 2018

Package reinforcedpred

Package scpdsi. November 18, 2018

Package IndepTest. August 23, 2018

Package CPE. R topics documented: February 19, 2015

Package monmlp. R topics documented: December 5, Type Package

Package sbgcop. May 29, 2018

Package GD. January 12, 2018

Package covsep. May 6, 2018

Package EnergyOnlineCPM

Package multivariance

Package bivariate. December 10, 2018

Package SEMModComp. R topics documented: February 19, Type Package Title Model Comparisons for SEM Version 1.0 Date Author Roy Levy

Package rdefra. March 20, 2017

Package mlmmm. February 20, 2015

Package esreg. August 22, 2017

Package platetools. R topics documented: June 25, Title Tools and Plots for Multi-Well Plates Version 0.1.1

Package face. January 19, 2018

Package noise. July 29, 2016

Package RTransferEntropy

Package MicroMacroMultilevel

Package denseflmm. R topics documented: March 17, 2017

Package bmem. February 15, 2013

Package supernova. February 23, 2018

Package CEC. R topics documented: August 29, Title Cross-Entropy Clustering Version Date

Package Rarity. December 23, 2016

Package KMgene. November 22, 2017

Package TSPred. April 5, 2017

Package weathercan. July 5, 2018

Package BNN. February 2, 2018

Package matrixnormal

Package camsrad. November 30, 2016

Package AMA. October 27, 2010

Package dhga. September 6, 2016

Package rnmf. February 20, 2015

Package jaccard. R topics documented: June 14, Type Package

Package icmm. October 12, 2017

Package gk. R topics documented: February 4, Type Package Title g-and-k and g-and-h Distribution Functions Version 0.5.

Package MultisiteMediation

Package abundant. January 1, 2017

Package dhsic. R topics documented: July 27, 2017

Package leiv. R topics documented: February 20, Version Type Package

Package Grace. R topics documented: April 9, Type Package

Package LCAextend. August 29, 2013

Package SK. May 16, 2018

Package Metrics. November 3, 2017

Package mpmcorrelogram

Package effectfusion

Package BayesNI. February 19, 2015

Package bigtime. November 9, 2017

Package idmtpreg. February 27, 2018

Package nowcasting. April 25, Type Package

Package HarmonicRegression

Package misctools. November 25, 2016

Package r2glmm. August 5, 2017

Package MonoPoly. December 20, 2017

Package noncompliance

Package chemcal. July 17, 2018

Package ltsbase. R topics documented: February 20, 2015

Package sgpca. R topics documented: July 6, Type Package. Title Sparse Generalized Principal Component Analysis. Version 1.0.

Package goftest. R topics documented: April 3, Type Package

Package lsei. October 23, 2017

Package rarpack. August 29, 2016

Package HDtest. R topics documented: September 13, Type Package

Package NanoStringDiff

Package kgc. December 21, Version Date Title Koeppen-Geiger Climatic Zones

Package MAPA. R topics documented: January 5, Type Package Title Multiple Aggregation Prediction Algorithm Version 2.0.4

Consider the following example of a linear system:

Package convospat. November 3, 2017

Package xdcclarge. R topics documented: July 12, Type Package

Package R1magic. August 29, 2016

Package rgabriel. February 20, 2015

Package cccrm. July 8, 2015

Package TideHarmonics

Package factorqr. R topics documented: February 19, 2015

Transcription:

Type Package Package coop November 14, 2017 Title Co-Operation: Fast Covariance, Correlation, and Cosine Similarity Operations Version 0.6-1 Fast implementations of the co-operations: covariance, correlation, and cosine similarity. The implementations are fast and memory-efficient and their use is resolved automatically based on the input data, handled by R's S3 methods. Full descriptions of the algorithms and benchmarks are available in the package vignettes. License BSD 2-clause License + file LICENSE Depends R (>= 3.1.0) Enhances slam (>= 0.1.32), Matri Suggests memuse NeedsCompilation yes ByteCompile yes URL https://github.com/wrathematics/coop BugReports https://github.com/wrathematics/coop/issues Maintainer <wrathematics@gmail.com> RoygenNote 5.0.1 Author [aut, cre], Christian Heckendorf [ctb] (Caught some memory errors.) Repository CRAN Date/Publication 2017-11-14 16:53:17 UTC R topics documented: coop-package........................................ 2 cosine............................................ 3 covar............................................. 4 1

2 coop-package pcor............................................. 5 scaler............................................ 6 sparsity........................................... 6 weighted........................................... 7 Inde 9 coop-package Cooperation: A Package of Co-Operations Fast implementations of the co-operations: covariance, correlation, and cosine similarity. The implementations are fast and memory-efficient and their use is resolved automatically based on the input data, handled by R s S3 methods. Full descriptions of the algorithms and benchmarks are available in the package vignettes. Covariance and correlation should largely need no introduction. Cosine similarity is commonly needed in, for eample, natural language processing, where the cosine similarity coefficients of all columns of a term-document or document-term matri is needed. The inplace argument When computing covariance and correlation with dense matrices, we must operate on the centered and/or scaled input data. When inplace=false, a copy of the matri is made. This allows for very wall-clock efficient processing at the cost of m*n additional double precision numbers allocated. On the other hand, if inplace=true, then the wall-clock performance will drop considerably, but at the memory epense of only m+n additional doubles. For perspective, given a 30,00030,000 matri, a copy of the data requires an additional 6.7 GiB of data, while the inplace method requires only 469 KiB, a 15,000-fold difference. Note that cosine is always computed in place. The t functions The package also includes "t" functions, like tcosine(). These behave analogously to tcrossprod() as crossprod() in base R. So of cosine() operates on the columns of the input matri, then tcosine() operates on the rows. Another way to think of it is, tcosine() = cosine(t()). Implementation Multiple storage schemes for the input data are accepted. For dense matrices, an ordinary R matri input is accepted. For sparse matrices, a matri in COO format, namely simple_triplet_matri from the slam package, is accepted. The implementation for dense matri inputs is dominated by a symmetric rank-k update via the BLAS subroutine dsyrk; see the package vignette for a discussion of the algorithm implementation and compleity. The implementation for two dense vector inputs is dominated by the product t() %*% y performed by the BLAS subroutine dgemm and the normalizing products t(y) %*% y, each computed via the BLAS function dsyrk.

cosine 3 cosine Cosine Similarity Compute the cosine similarity matri efficiently. The function synta and behavior is largely modeled after that of the cosine() function from the lsa package, although with a very different implementation. cosine(, y, use = "everything", inverse = FALSE) tcosine(, y, use = "everything", inverse = FALSE) y use inverse A numeric dataframe/matri or vector. A vector (when is a vector) or missing (blank) when is a matri. The NA handler, as in R s cov() and cor() functions. Options are "everything", "all.obs", and "complete.obs". Logical; should the inverse covariance matri be returned? See?coop-package for implementation details. The n n matri of all pair-wise vector cosine similarities of the columns. See Also sparsity Eamples <- matri(rnorm(10*3), 10, 3) coop::cosine() coop::cosine([, 1], [, 2])

4 covar covar Covariance An optimized, efficient implemntation for computing covariance. covar(, y, use = "everything", inplace = FALSE, inverse = FALSE) tcovar(, y, use = "everything", inplace = FALSE, inverse = FALSE) y use inplace inverse A numeric dataframe/matri or vector. A vector (when is a vector) or missing (blank) when is a matri. The NA handler, as in R s cov() and cor() functions. Options are "everything", "all.obs", and "complete.obs". Logical; if TRUE then the method used is slower but uses less memory than if FALSE. See?coop-package for details. Logical; should the inverse covariance matri be returned? See?coop-package for implementation details. The covariance matri. See Also cosine Eamples <- matri(rnorm(10*3), 10, 3) coop::pcor() coop::pcor([, 1], [, 2])

pcor 5 pcor Pearson Correlation An optimized, efficient implemntation for computing the pearson correlation. pcor(, y, use = "everything", inplace = FALSE, inverse = FALSE) tpcor(, y, use = "everything", inplace = FALSE, inverse = FALSE) y use inplace inverse A numeric dataframe/matri or vector. A vector (when is a vector) or missing (blank) when is a matri. The NA handler, as in R s cov() and cor() functions. Options are "everything", "all.obs", and "complete.obs". Logical; if TRUE then the method used is slower but uses less memory than if FALSE. See?coop-package for details. Logical; should the inverse covariance matri be returned? See?coop for implementation details. The pearson correlation matri. See Also cosine Eamples <- matri(rnorm(10*3), 10, 3) coop::pcor() coop::pcor([, 1], [, 2])

6 sparsity scaler scaler A function to center (subtract mean) and/or scale (divide by standard deviation) data column-wise in a computationally efficient way. scaler(, center = TRUE, scale = TRUE) center, scale The input matri. Logical; determine if the data should be centered and/or scaled. Unlike its R counterpart scale(), the arguments center and scale can only be logical values (and not vectors). The centered/scaled data, with attributes as in R s scale(). sparsity Sparsity Show the sparsity (as a count or proportion) of a matri. For eample,.99 sparsity means 99% of the values are zero. Similarly, a sparsity of 0 means the matri is fully dense. sparsity(, proportion = TRUE) proportion The matri, stored as an ordinary R matri or as a "simple triplet matri" (from the slam package). Logical; should a proportion or a count be returned?

weighted 7 The implementation is very efficient for dense matrices. For sparse triplet matrices, the count is trivial. The sparsity of the input matri, as a proportion or a count. Eamples ## Completely sparse matri <- matri(0, 10, 10) coop::sparsity() ## 15\% density / 85\% sparsity [sample(length(), size=15)] <- 1 coop::sparsity() weighted Weighted Co-Operation An optimized, efficient implemntation for computing weighted covariance, correlation, and cosine similarity. Similar to R s cov.wt(). wt method A matri or data.frame. A vector of weights or scalar weight. Either "unbiased" or "ml". Unlike R, case is ignored. See?coop-package for implementation details. See Also cosine, pcor, and covar

8 weighted Eamples <- matri(rnorm(10*3), 10, 3) cov.wt()

Inde Topic package coop-package, 2 coop-package, 2 cosine, 3, 4, 5, 7 covar, 4, 7 pcor, 5, 7 scaler, 6 sparsity, 3, 6 tcosine (cosine), 3 tcovar (covar), 4 tpcor (pcor), 5 weighted, 7 9