Package rnmf. February 20, 2015

Similar documents
Package scio. February 20, 2015

Package gwqs. R topics documented: May 7, Type Package. Title Generalized Weighted Quantile Sum Regression. Version 1.1.0

Package MonoPoly. December 20, 2017

Package RTransferEntropy

Package depth.plot. December 20, 2015

Package ltsbase. R topics documented: February 20, 2015

Package spcov. R topics documented: February 20, 2015

Package factorqr. R topics documented: February 19, 2015

Package polypoly. R topics documented: May 27, 2017

Package gtheory. October 30, 2016

Package varband. November 7, 2016

Package RootsExtremaInflections

Package Tcomp. June 6, 2018

Package mmm. R topics documented: February 20, Type Package

Package sscor. January 28, 2016

Package HIMA. November 8, 2017

Package rarpack. August 29, 2016

Package EMVS. April 24, 2018

Package ForwardSearch

Package aspi. R topics documented: September 20, 2016

Package SpatialVS. R topics documented: November 10, Title Spatial Variable Selection Version 1.1

Package Select. May 11, 2018

Package SK. May 16, 2018

Package leiv. R topics documented: February 20, Version Type Package

Package penmsm. R topics documented: February 20, Type Package

Package ShrinkCovMat

Package darts. February 19, 2015

Package BayesNI. February 19, 2015

Package dhga. September 6, 2016

Package netcoh. May 2, 2016

Package alabama. March 6, 2015

Package nntensor. September 16, 2018

Package TwoStepCLogit

Package dhsic. R topics documented: July 27, 2017

Package R1magic. August 29, 2016

Package Grace. R topics documented: April 9, Type Package

Package velociraptr. February 15, 2017

Package bhrcr. November 12, 2018

Package multivariance

Package pssm. March 1, 2017

Package Rarity. December 23, 2016

Package covtest. R topics documented:

Package RSwissMaps. October 3, 2017

Package rarhsmm. March 20, 2018

Package effectfusion

Package pearson7. June 22, 2016

Package icmm. October 12, 2017

Package Perc. R topics documented: December 5, Type Package

Package SpatialNP. June 5, 2018

Package rbvs. R topics documented: August 29, Type Package Title Ranking-Based Variable Selection Version 1.0.

Package LIStest. February 19, 2015

Package kgc. December 21, Version Date Title Koeppen-Geiger Climatic Zones

Package metafuse. October 17, 2016

Package lomb. February 20, 2015

Package R1magic. April 9, 2013

Package gma. September 19, 2017

Package testassay. November 29, 2016

Package SEMModComp. R topics documented: February 19, Type Package Title Model Comparisons for SEM Version 1.0 Date Author Roy Levy

Package drgee. November 8, 2016

Package ebal. R topics documented: February 19, Type Package

Package MultisiteMediation

Package robustrao. August 14, 2017

Package JointModel. R topics documented: June 9, Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1.

Package MPkn. May 7, 2018

Package sgr. February 20, 2015

Package CPE. R topics documented: February 19, 2015

Package rgabriel. February 20, 2015

Package jmcm. November 25, 2017

Package nardl. R topics documented: May 7, Type Package

Package bayeslm. R topics documented: June 18, Type Package

Package matrixnormal

Package PeriodicTable

Package ARCensReg. September 11, 2016

Package SimSCRPiecewise

Package abundant. January 1, 2017

Package ICBayes. September 24, 2017

Package SpatPCA. R topics documented: February 20, Type Package

Package covsep. May 6, 2018

Package PVR. February 15, 2013

Package metansue. July 11, 2018

Package sgpca. R topics documented: July 6, Type Package. Title Sparse Generalized Principal Component Analysis. Version 1.0.

Package chemcal. July 17, 2018

Package ensemblebma. R topics documented: January 18, Version Date

Package EnergyOnlineCPM

Package basad. November 20, 2017

Package clogitboost. R topics documented: December 21, 2015

Package unbalhaar. February 20, 2015

Package DNMF. June 9, 2015

Package mpmcorrelogram

Package BNN. February 2, 2018

Package beam. May 3, 2018

Package pfa. July 4, 2016

Package monmlp. R topics documented: December 5, Type Package

Package orclus. R topics documented: March 17, Version Date

Package hot.deck. January 4, 2016

Package flexcwm. R topics documented: July 11, Type Package. Title Flexible Cluster-Weighted Modeling. Version 1.0.

Package separationplot

Package ZIBBSeqDiscovery

Package ExtremeBounds

Package MiRKATS. May 30, 2016

Transcription:

Type Package Title Robust Nonnegative Matrix Factorization Package rnmf February 20, 2015 An implementation of robust nonnegative matrix factorization (rnmf). The rnmf algorithm decomposes a nonnegative high dimension data matrix into the product of two low rank nonnegative matrices, while detecting and trimming outliers. The main function is rnmf(). The package also includes a visualization tool, see(), that arranges and prints vectorized images. Version 0.5.0 Author Yifan Ethan Xu <ethan.yifanxu@gmail.com>, Jiayang Sun <jsun@case.edu> Maintainer Yifan Ethan Xu <ethan.yifanxu@gmail.com> Depends R (>= 2.14.0) Imports nnls, knitr VignetteBuilder knitr License GPL (>= 2) LazyData true NeedsCompilation no Repository CRAN Date/Publication 2015-02-11 07:47:35 R topics documented: face............................................. 2 rnmf............................................. 2 see.............................................. 4 Symbols........................................... 5 Symbols_c.......................................... 6 tumor............................................ 6 Index 7 1

2 rnmf face An Image of a Corrupted Face Image. A 192 by 168 data matrix storing gray-scale pixel intensities of a corrupted face image. The original image was taken from Yale Face Database B [Georghiades et al. 2001] and then manually corrupted. Run the following line to see the image: image(t(face[ncol(face):1, ]), axes = FALSE, col = grey(seq(0, 1, length = 256)), asp = 1) A data matrix containing 192 rows and 168 columns. rnmf Robust Penalized Nonnegative Matrix Factorization (rnmf). rnmf performs robust penalized nonnegative matrix factorizaion on a nonnegative matrix X to obtain low dimensional nonnegative matrices W and H, such that X ~ WH, while detecting and trimming different types of outliers in X. Usage rnmf(x, k = 5, alpha = 0, beta = 0, maxit = 20, tol = 0.001, gamma = FALSE, ini.w = NULL, ini.zeta = NULL, my.seed = NULL, variation = "cell", quiet = FALSE, nreg = 1, showprogress = TRUE) Arguments X k alpha beta maxit a p by n nonnegative numerical matrix to be decomposed into X ~ WH. the integer dimension to which X is projected (equals the number of columns of W and the number of rows of H, and smaller than min(p,n)). Default = 5. a nonnegative number that controls the magnitude of W. The larger the alpha is, the higher penalty on W is, or W is forced to be smaller. Default = 0, representing no penalty on the magnitude W. a nonnegative number that controls the sparsity of H. Default= 0. The larger the beta is, the higher penalty on sum_i H_j is, where H_j is the j-th column of H. Default = 0, representing no penalty on the sparsity of H. the maximum number of iterations. Default = 20. The algorithm is done as follows: 1) fits W given H, then trims outliers in X, i.e. ones with large residuals from X - WH, then refits W without the outliers [this step can be repeated nreg times, currently nreg = 1], then 2) repeat as in 1) with the roles of W and H swapped, and then iterate through 1) and 2) until convergence. Default = 20, allowing a maximum of 20 pairs of 1) and 2).

rnmf 3 tol Details Value gamma ini.w ini.zeta my.seed variation quiet the convergence tolerance. We suggest it to be a positive number smaller than 0.01. Default = 0.001. a trimming percentage in (0, 1) of X. Default = 0.05 will trim 5% of elements of X (measured by cell, row or column as specified in variation ). If trim=0, there is no trim; so rnmf then performs the regular NMF. the initialization of the left matrix W, which is a "p by k" nonnegative numeric matrix. Default = NULL directs the algorithm to randomly generate an ini.w. a "p by n" logical matrix of True or False, indicating the initial locations of the outliers. The number of "TRUE"s in ini.zeta must be less than or equal to m = the rounded integer of (gamma * p * n). Default = NULL, initializes the cells as TRUE with the m largest residuals after the first iteration. Required only for "cell" trimming option. the random seed for initialization of W. If left to be NULL(default) a random seed is used. Required only if ini.w is not NULL. a character string indicating which trimming variation is used. The options are: cell (default), col, row and smooth. default = FALSE indicating a report would be given at the end of execution. If quiet = TRUE, no report is provide at the end. nreg the number of inner loop iterations [see maxit above] to find outliers in X, given either H or W. Default = 1, is currently only implemented option in the "cell" variation. showprogress default = TRUE, shows a progress bar during iterations. If showprogress = FALSE, no progress bar is shown. rnmf decomposes a nonnegative p by n data matrix X as X ~ WH and detect and trims outliers. Here W and H are p by k and k by n nonnegative matrices, respectively; and k <= min{p, n} is the dimension of the subspace to which X is projected. The objective function is X - WH _gamma + alpha * W _2^2 + beta * sum( H_.j )^2 where alpha controls the magnitude of W, and beta controls the sparsity of H. The algorithm iteratively updates W, H and the outlier set zeta with alternating conditional nonnegative least square fittings until convergence. Four variations of trimming are included in the algorithm: "cell", "row", "column" and "smooth". Specifically, the "cell" variation trims individual cell-wise outliers while "row" and "column" variations trim entire row or column outliers. The fourth variation "smooth" fills the cells that are declared outliers in each iteration by the average of the surrounding cells. An object of class rnmf, which is a list of the following items: W: left matrix of the decomposition X ~ WH, columns of which (i.e. W) are basis vectors of the low dimension projection. H: right matrix of the decomposition X ~ WH, columns of which (i.e. W) are low dimensional encoding of the data.

4 see Examples fit: the fitted matrix W %*% H. trimmed: a list of locations of trimmed cells in each iteration. niter: the number of iterations performed. ## Load a clean single simulated tumor image. data("tumor") image(tumor) # look at the tumor image dim(tumor) # it is a 70 by 70 matrix ## Add 5\% corruptions. tumor.corrupted <- tumor set.seed(1) tumor.corrupted[sample(1:4900, round(0.05 * 4900), replace = FALSE)] <- 1 ## Run rnmf with different settings # No trimming res.rnmf1 <- rnmf(x = tumor.corrupted, gamma = FALSE, my.seed = 1) # 6 percent trimming, low dimension k = 5 (default) res.rnmf2 <- rnmf(x = tumor.corrupted, tol = 0.001, gamma = 0.06, my.seed = 1) # add sparsity constraint of H (beta = 0.1) with k = 10, and the "smooth" variation. res.rnmf3 <- rnmf(x = tumor.corrupted, k = 10, beta = 0.1, tol = 0.001, gamma = 0.06, my.seed = 1, variation = "smooth", maxit = 30) ## Show results: par(mfrow = c(2,2), mar = c(2,2,2,2)) image(tumor.corrupted, main = "Corrupted tumor image", xaxt = "n", yaxt = "n") image(res.rnmf1$fit, main = "rnmf (no trimming) fit", xaxt = "n", yaxt = "n") image(res.rnmf2$fit, main = "rnmf (cell) fit 2", xaxt = "n", yaxt = "n") image(res.rnmf3$fit, main = "rnmf (smooth) fit 3", xaxt = "n", yaxt = "n") see Visualize Vectorized Images Usage The function is a wrapper of image(). It arranges and prints multiple or single images. see(x, title = "Image", col = "heat", input = "multi", layout = "auto",...) Arguments X title a numeric matrix. X is either a matrix where each column contains the pixels of a vectorized image, or simply the pixel matrix of one single image. The type of X is indicated by the argument input. a charactor string. Title of the graph.

Symbols 5 col a character string. Defult = "heat". What color scheme to use? Currently allows: "heat" for heat color "br" for (blue-cyan-green-yellow-red) palette "grey" for grey scale input a charactor string with default = "multi", specifying the type of images in X. Possible options are: "multi" if X contains multiple vectorized square images. "single" if X is the matrix of a single image. layout a vector of 2 possible integers or a charactor string "auto" (default). If layout = "auto", multiple images will be arranged in an approximatedly 9 by 16 ratio. If layout = c(a,b), then images will be arranged in a rows and b columns.... further arguments to pass to image(). Details If the input is a matrix of vectorized images (input = "multi", default setting), that is, each column contains pixels of one vectorized image, then see() restores each column into a matrix and show all images in one frame. Current version assumes the images are squared images. If the input is a matrix of one image (input = "single"), see() shows this image. Different color palette can be selected by specify the "col" argument. Build-in color palette includes greyscale, blue-red and heat color. Examples ## Load a build-in data set Symbols, a 5625 by 30 matrix containing 30 75x75 ## images. data(symbols) see(symbols, title = "Sample images of four symbols") Symbols 30 Images Containing 4 Different Symbols. A 5625 by 30 matrix. Each column stores vectorized pixels of a 75 by 75 image. Every image contains one or more of four shapes: circle, square, triangle and cross. The pixel intensities are standarized to 0 to 1. Run see(symbols) to visualize. A 5625 by 30 numeric matrix.

6 tumor Symbols_c 30 Images Containing 4 Different Symbols with Corruptions. A 5625 by 30 matrix. Each column stores vectorized pixels of a 75 by 75 image. Every image contains one or more of four shapes: circle, square, triangle and cross. The pixel intensities are standarized to 0 to 1. Box corruptions are randomly added. Run see(symbols_c) to visualize. A 5625 by 30 numeric matrix. tumor An Image of 3 Simulated Tumors. A 70 by 70 data matrix storing pixel intensities of three tumors that are normalized to 0 to 1. A 70 by 70 numeric matrix.

Index Topic datasets face, 2 Symbols, 5 Symbols_c, 6 tumor, 6 face, 2 rnmf, 2 see, 4 Symbols, 5 Symbols_c, 6 tumor, 6 7