
Package ENA

February 15, 2013

Type Package
Title Ensemble Network Aggregation
Version 1.2-4
Date 2013-02-14
Author
Maintainer
Depends R (>= 2.13.0), space (>= 0.1), WGCNA (>= 1.20), GeneNet (>= 1.2.5), parmigene (>= 1.0)
Suggests snow (>= 0.3)
Description Ensemble network aggregation is an approach which leverages the inverse-rank-product (IRP) method to combine networks. This package provides the capabilities to use IRP to bootstrap a dataset using a single method, to aggregate the networks produced by multiple methods, or to aggregate the networks produced on different datasets.
License GPL (>= 2)
LazyLoad yes
Collate bootstrap.r buildaracne.r buildgenenet.r buildspace.r buildwgcna.r ena.r symmetricize.r gettableaddres
NeedsCompilation no
Repository CRAN
Date/Publication 2013-02-15 06:38:22

R topics documented:

    adj2mat
    bootstrap
    buildaracne
    buildgenenet
    buildspace
    buildwgcna
    edgecutoff
    ena
    gettableaddressing
    mat2adj
    net1345
    net17
    net231
    net44
    net613
    net83
    simulatenetwork
    symmetricize
    tri2mat
    Index


adj2mat    Convert adjacency list into an adjacency matrix.

Description

Converts an adjacency-like list (which may or may not contain all the gene IDs in the network) into an adjacency matrix.

Usage

    adj2mat(adjlist, IDs = sort(union(adjlist[, 1], adjlist[, 2])))

Arguments

adjlist    The adjacency list of the network to convert. There should be two or three columns: source, target, and the (optional) regulation value. If the regulation value isn't specified, it's assumed to be 1. To avoid any confusion, we require that the columns be named exactly "Source", "Target", and "Regulation", where "Regulation" represents the strength or weight of the edge in the network.

IDs        The set of genes in this network. By default, the set of all genes mentioned in the adjacency list. IDs can be provided if there are unconnected genes in the network which aren't mentioned in the adjacency list, or when the ordering of the genes is important.
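As a rough illustration of the conversion described above, the following base-R sketch builds an adjacency matrix from a small adjacency list. It is not the package's implementation: the helper name adj2mat_sketch, the toy data, and the choice to fill both mirrored cells (treating edges as undirected) are assumptions made for illustration only.

```r
# Hypothetical helper, not part of ENA: converts an adjacency list with
# columns "Source", "Target" and optional "Regulation" into a matrix.
adj2mat_sketch <- function(adjlist,
                           IDs = sort(union(adjlist[, 1], adjlist[, 2]))) {
  IDs <- as.character(IDs)
  # A missing Regulation column is treated as weight 1, as documented above.
  if (!"Regulation" %in% colnames(adjlist)) {
    adjlist$Regulation <- 1
  }
  mat <- matrix(0, nrow = length(IDs), ncol = length(IDs),
                dimnames = list(IDs, IDs))
  src <- as.character(adjlist$Source)
  tgt <- as.character(adjlist$Target)
  # Assumption: edges are undirected, so both mirrored cells are filled.
  mat[cbind(src, tgt)] <- adjlist$Regulation
  mat[cbind(tgt, src)] <- adjlist$Regulation
  mat
}

toy <- data.frame(Source = c("a", "b"), Target = c("b", "c"),
                  stringsAsFactors = FALSE)
# "d" is an unconnected gene, so it has to be supplied through IDs.
toy_mat <- adj2mat_sketch(toy, IDs = c("a", "b", "c", "d"))
print(toy_mat)
```

Supplying IDs explicitly is what keeps the unconnected gene "d" in the matrix; with the default, only genes named in the adjacency list would appear.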

Examples

    #Load in the sample Protein-Protein-Interaction data that comes with this package.
    #Simulate a dataset based on the 44-gene topology provided.
    sim <- simulatenetwork(net44)
    #Convert the true, underlying adjacency list to an adjacency matrix
    truemat <- adj2mat(net44)


bootstrap    Bootstrap the reconstruction of a network

Description

Randomly selects a subset of the available samples and performs a network reconstruction using the selected technique. Aggregates all produced networks into a single network using the ena method.

Usage

    bootstrap(data, fun, sample.percentage = 0.7, iterations = 150, cluster, truth)

Arguments

data                 The dataset to reconstruct. Each column should contain one sample, and each row should contain one gene.

fun                  The network reconstruction technique to employ while bootstrapping. Could be one of the provided methods, such as buildspace, or a custom function. Provide the name of the function in quotes.

sample.percentage    The fraction of samples to select for each iteration (the default of 0.7 selects 70% of the samples).

iterations           The number of bootstrapping iterations to perform, i.e. the number of networks to build.

cluster              Optionally provide an RMPI cluster (of class MPIcluster) to distribute the workload across.

truth                The true network structure. Typically not available, but useful in testing and debugging.

Value

A data.frame representing the adjacency list of the ENA-produced network.
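The resampling step described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: toy_build (absolute Pearson correlation) stands in for a real reconstruction method, samples are assumed to be drawn without replacement, and the final aggregation step is left out.

```r
set.seed(1)
# Toy expression data: 5 genes (rows) by 20 samples (columns).
expr <- matrix(rnorm(5 * 20), nrow = 5,
               dimnames = list(paste0("g", 1:5), NULL))

# Stand-in reconstruction method (assumption): absolute Pearson correlation
# between genes. cor() works on columns, so the matrix is transposed first.
toy_build <- function(data) abs(cor(t(data)))

sample.percentage <- 0.7
iterations <- 10
n_keep <- round(sample.percentage * ncol(expr))

# Each iteration draws a subset of samples (assumed here to be without
# replacement) and stores the upper triangle of the resulting network.
boot_nets <- sapply(seq_len(iterations), function(i) {
  keep <- sample(ncol(expr), n_keep)
  net <- toy_build(expr[, keep, drop = FALSE])
  net[upper.tri(net)]
})

# One row per gene pair, one column per bootstrap iteration.
dim(boot_nets)
```

Each column of boot_nets is one bootstrapped network in upper-triangle form, which is the one-column-per-network layout that the aggregation step takes as input.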

Examples

    #Load in the sample Protein-Protein-Interaction data that comes with this package.
    set.seed(123)
    #Simulate a dataset based on the 44-gene topology provided.
    sim <- simulatenetwork(net44)
    #Bootstrap GeneNet over 10 iterations, using 90% of the samples each time
    boot <- bootstrap(sim, "buildgenenet", sample.percentage = 0.9, iterations = 10)
    bootmat <- tri2mat(rownames(sim), boot[,3])


buildaracne    Reconstruct network using Aracne

Description

Reconstructs a gene regulatory network using the Aracne algorithm.

Usage

    buildaracne(mat)

Arguments

mat    The matrix on which to reconstruct. The matrix should store one gene per row, and one sample per column.

Value

The adjacency matrix of the genes provided.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Reconstruct the network using Aracne, then grab the upper triangular portion
    # of the matrix
    ar <- abs(buildaracne(net))
    ar <- ar[upper.tri(ar)]

buildgenenet    Reconstruct network using GeneNet

Description

Reconstructs a gene regulatory network using the GeneNet algorithm.

Usage

    buildgenenet(data)

Arguments

data    The matrix on which to reconstruct. The matrix should store one gene per row, and one sample per column.

Value

The adjacency matrix of the genes provided.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Reconstruct the network using GeneNet, then grab the upper triangular portion
    # of the matrix
    gn <- abs(buildgenenet(net))
    gn <- gn[upper.tri(gn)]


buildspace    Reconstruct network using SPACE

Description

Reconstructs a gene regulatory network using the SPACE algorithm.

Usage

    buildspace(data)

Arguments

data    The matrix on which to reconstruct. The matrix should store one gene per row, and one sample per column.

Value

The adjacency matrix of the genes provided.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Process with SPACE
    sp <- abs(buildspace(net))
    sp <- sp[upper.tri(sp)]


buildwgcna    Reconstruct network using WGCNA

Description

Reconstructs a gene regulatory network using the WGCNA algorithm.

Usage

    buildwgcna(mat)

Arguments

mat    The matrix on which to reconstruct. The matrix should store one gene per row, and one sample per column. Note that this is the transpose of the layout WGCNA typically accepts.

Value

The adjacency matrix of the genes provided.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Process with WGCNA
    wg <- abs(buildwgcna(net))


edgecutoff    Compute a binary network from a continuous rank-product network.

Description

Compute a binary network from a continuous rank-product network.

Usage

    edgecutoff(rp, nnets, pfp = 0.05, nperm = 100)

Arguments

rp       The rank product of the network you wish to binarize. This can either be the adjacency matrix of the network, or just the upper triangle of that network, as could be computed by the ena function.

nnets    The number of networks used to compute the given rank product.

pfp      The percentage of false positives to use as a cutoff.

nperm    The number of permutations to run when calculating the pfps of the network.

Value

The binarized network in which only edges surpassing the specified significance level are maintained. If rp was provided as a named matrix, the result will also be a matrix. If rp was a vector of the upper triangle, the result will also be a vector of the upper triangle. Note that an upper triangle can be converted back to an adjacency matrix using tri2mat.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Reconstruct the network using GeneNet, then grab the upper triangular portion
    # of the matrix

    gn <- abs(buildgenenet(net))
    gn <- gn[upper.tri(gn)]
    #Process with WGCNA
    wg <- abs(buildwgcna(net))
    wg <- wg[upper.tri(wg)]
    #Process with SPACE
    sp <- abs(buildspace(net))
    sp <- sp[upper.tri(sp)]
    #Aggregate methods using ENA
    ena <- ena(cbind(gn, wg, sp))
    #Convert from a triangular vector to a full matrix.
    enamat <- tri2mat(rownames(net), ena)
    #Extract only those edges in the graph which pass the cutoff
    binarized <- edgecutoff(enamat, 3)


ena    Perform ensemble network aggregation

Description

Aggregates several adjacency-list-formatted networks into a single adjacency list.

Usage

    ena(adjacencylist, method = c("rankprod"))

Arguments

adjacencylist    A data.frame in which each column represents the connection weights of a network. Each row represents a possible connection within the network.

method           Currently only supports (Inverse) Rank Product, specified by "RankProd".

Value

A single adjacency list representing the Inverse Rank Product of all connections in the provided networks.
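To make the rank-product idea above more concrete, here is a small base-R sketch of a generic rank-product aggregation. It is an illustration under assumptions rather than a description of how ena works internally: the toy scores, the choice to give the strongest edge rank 1, and the reading of "inverse" as the reciprocal of the rank product are all assumptions.

```r
# Three toy networks scored on the same four candidate edges (rows).
scores <- cbind(netA = c(0.9, 0.1, 0.5, 0.3),
                netB = c(0.8, 0.2, 0.6, 0.1),
                netC = c(0.7, 0.3, 0.2, 0.4))

# Rank edges within each network so that the strongest edge gets rank 1.
ranks <- apply(-scores, 2, rank, ties.method = "average")

# Product of the ranks across networks: a small product means the edge was
# ranked consistently high by all methods.
rank_product <- apply(ranks, 1, prod)

# Assumption: "inverse" is taken here as the reciprocal, so that larger
# values indicate stronger aggregated support.
inverse_rp <- 1 / rank_product

print(cbind(ranks, rank_product, inverse_rp))
```

Because only ranks enter the product, networks whose raw weights live on different scales (for example, correlations and partial correlations) can be combined without rescaling.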

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Reconstruct the network using GeneNet, then grab the upper triangular portion
    # of the matrix
    gn <- abs(buildgenenet(net))
    gn <- gn[upper.tri(gn)]
    #Process with WGCNA
    wg <- abs(buildwgcna(net))
    wg <- wg[upper.tri(wg)]
    #Process with SPACE
    sp <- abs(buildspace(net))
    sp <- sp[upper.tri(sp)]
    #Aggregate methods using ENA
    ena <- ena(cbind(gn, wg, sp))
    #Convert from a triangular vector to a full matrix.
    enamat <- tri2mat(rownames(net), ena)


gettableaddressing    Get the adjacency list addressing template.

Description

Useful if you want to store the networks in their condensed upper-diagonal form while still having the benefit of convenient addressing, and/or if you are using a simulated dataset in which you know the truth and want to store all the values in a single data.frame.

Usage

    gettableaddressing(variablenames, truth)

Arguments

variablenames    The names of all genes to include in the adjacency list.

truth            The true adjacency matrix. Often will not be available, but is useful for debugging and testing.

Details

Internal function used to get the addressing template for a data.frame to contain the adjacency list representation of a matrix.

Value

A data.frame representing the adjacency list of the matrix provided.

Examples

    #Load in the sample Protein-Protein-Interaction data that comes with this package.
    #Simulate a dataset based on the 44-gene topology provided.
    sim <- simulatenetwork(net44)
    #Convert the true, underlying adjacency list to an adjacency matrix
    truemat <- adj2mat(net44)
    #Reconstruct using GeneNet
    gn <- abs(buildgenenet(sim))
    gn <- gn[upper.tri(gn)]
    #Reconstruct using WGCNA
    wg <- abs(buildwgcna(sim))
    wg <- wg[upper.tri(wg)]
    #Aggregate all results into a single data.frame
    data <- gettableaddressing(rownames(sim), truemat)
    data <- cbind(data, gn, wg)


mat2adj    Convert a matrix to an adjacency list

Description

Takes a matrix and converts all non-zero elements to an adjacency list, using the row/colnames as the names for this list. Currently, the matrix must be symmetric.

Usage

    mat2adj(adjmat)

Arguments

adjmat    The symmetric adjacency matrix, with rows and columns named.
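The conversion described above can be approximated in base R as shown below. This is a sketch, not the package's implementation: the helper name mat2adj_sketch is hypothetical, and it assumes that each undirected edge is reported once (from the upper triangle) and that the diagonal is ignored, which may differ from what mat2adj actually returns.

```r
# Hypothetical helper, not part of ENA.
mat2adj_sketch <- function(adjmat) {
  stopifnot(isSymmetric(adjmat))
  # Assumption: each undirected edge is reported once, taken from the
  # upper triangle, and the diagonal is ignored.
  keep <- upper.tri(adjmat) & adjmat != 0
  idx <- which(keep, arr.ind = TRUE)
  data.frame(Source = rownames(adjmat)[idx[, "row"]],
             Target = colnames(adjmat)[idx[, "col"]],
             Regulation = adjmat[idx],
             stringsAsFactors = FALSE)
}

m <- matrix(c(0, 4, 0,
              4, 0, 2,
              0, 2, 0), ncol = 3,
            dimnames = list(letters[1:3], letters[1:3]))
edges <- mat2adj_sketch(m)
print(edges)
```

The column names "Source", "Target" and "Regulation" follow the naming that adj2mat requires, so the result of such a conversion could be fed back into an adjacency-list-based function.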

Examples

    mat <- matrix(c(1,4,0,4,1,2,0,2,1), ncol=3)
    rownames(mat) <- colnames(mat) <- letters[1:3]
    mat2adj(mat)


net1345    Sample network topology using 1345 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 1345 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>


net17    Sample network topology using 17 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 17 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>


net231    Sample network topology using 231 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 231 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>

net44    Sample network topology using 44 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 44 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>


net613    Sample network topology using 613 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 613 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>


net83    Sample network topology using 83 genes with interactions based on observed protein-protein interaction networks.

Description

Sample network topology using 83 genes with interactions based on observed protein-protein interaction networks.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>

simulatenetwork    Simulate a gene expression dataset.

Description

Simulates the observed gene expression levels in a dataset using the underlying truth network provided, allowing customization of the number of samples and the noise levels in the dataset.

Usage

    simulatenetwork(adjlist, genes = sort(union(adjlist[, 1], adjlist[, 2])), samples = 100, noise = 1)

Arguments

adjlist    The adjacency list of the matrix you're looking to simulate. The first column should be the source and the second column the target. To avoid any confusion, we require that the columns be named exactly "Source" and "Target".

genes      The list of all genes in the network. By default this is any gene mentioned in the adjacency list.

samples    The number of samples you wish to simulate.

noise      The amount of noise present in the simulated expression levels.

Author(s)

Guanghua Xiao <Guanghua.Xiao@UTSouthwestern.edu>

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)


symmetricize    Make a matrix symmetric

Description

Make the matrix symmetric by making all "mirrored" positions consistent. A variety of methods are provided to make the matrix symmetrical.

Usage

    symmetricize(matrix, method = c("max", "min", "avg", "ld", "ud"), adjacencylist = FALSE)

Arguments

matrix           The matrix to make symmetric.

method           The method to use to make the matrix symmetric. Default is to take the maximum.
                 "max"    For each position m[i,j], use the maximum of (m[i,j], m[j,i]).
                 "min"    For each position m[i,j], use the minimum of (m[i,j], m[j,i]).
                 "avg"    For each position m[i,j], use the mean: (m[i,j] + m[j,i]) / 2.
                 "ld"     Copy the lower triangular portion of the matrix to the upper triangular portion.
                 "ud"     Copy the upper triangular portion of the matrix to the lower triangular portion.

adjacencylist    Logical. If FALSE, returns the symmetric matrix (the same format as the input). If TRUE, returns an adjacency list representing the upper triangular portion of the adjacency matrix, with addressing based on the row.names of the matrix provided.

Value

The symmetric matrix.

Examples

    #Create a sample 3x3 matrix
    mat <- matrix(1:9, ncol=3)
    #Copy the upper diagonal portion to the lower
    symmetricize(mat, "ud")
    #Take the average of each symmetric location
    symmetricize(mat, "avg")
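For readers who want to see what the five methods listed above amount to, the base-R sketch below reproduces each rule on the same 3x3 matrix used in the example. The variable names are illustrative, and the code is a plain restatement of the documented rules rather than the package's own implementation.

```r
m <- matrix(1:9, ncol = 3)

sym_max <- pmax(m, t(m))        # "max": larger of m[i, j] and m[j, i]
sym_min <- pmin(m, t(m))        # "min": smaller of m[i, j] and m[j, i]
sym_avg <- (m + t(m)) / 2       # "avg": mean of the two mirrored entries

# "ud": copy the upper triangle onto the lower triangle.
sym_ud <- m
sym_ud[lower.tri(sym_ud)] <- t(m)[lower.tri(m)]

# "ld": copy the lower triangle onto the upper triangle.
sym_ld <- m
sym_ld[upper.tri(sym_ld)] <- t(m)[upper.tri(m)]

print(sym_ud)
print(sym_avg)
```

Indexing t(m) with the same triangle mask is what pairs each position with its mirror image; assigning m[upper.tri(m)] directly into the lower triangle would place values in the wrong cells, because the two masks enumerate positions in different orders.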

tri2mat    Convert triangular elements to full matrix.

Description

Converts the upper or lower-triangular portion of a matrix back to the complete 2D matrix using the gene names provided. The matrix is assumed to be symmetrical.

Usage

    tri2mat(genes, tri, diag = 1, upper = TRUE)

Arguments

genes    The names of the genes to use as row and column names. Note that these must be in the same order that was used when the triangular portion was extracted from the matrix. Otherwise, the matrix will not be constructed correctly.

tri      The triangular elements of the matrix. Could be extracted using a command like mat[upper.tri(mat)].

diag     The value to use for the diagonal elements in the matrix.

upper    TRUE if tri represents the upper triangular portion of a matrix, FALSE if the lower. upper.tri and lower.tri extract the elements in a different order.

Value

The complete 2D matrix represented by the triangular portion provided.

Examples

    #Load in the sample PPI data provided with this package
    #Simulate the network based on one of the adjacency lists just loaded.
    net <- simulatenetwork(net44)
    #Reconstruct the network using GeneNet, then grab the upper triangular portion
    # of the matrix
    gn <- abs(buildgenenet(net))
    gn <- gn[upper.tri(gn)]
    #Convert from a triangular vector to a full matrix.
    gnmat <- tri2mat(rownames(net), gn)
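The round trip between a full symmetric matrix and its upper triangle can be sketched in base R as below. The helper tri2mat_sketch and its diag_value parameter are hypothetical names used only for this illustration; the sketch handles the upper-triangle case only and is not the package's implementation.

```r
# Hypothetical helper, not part of ENA: rebuilds a symmetric matrix from the
# vector produced by mat[upper.tri(mat)].
tri2mat_sketch <- function(genes, tri, diag_value = 1) {
  n <- length(genes)
  stopifnot(length(tri) == n * (n - 1) / 2)
  mat <- matrix(0, n, n, dimnames = list(genes, genes))
  # upper.tri() enumerates positions in column-major order, which is the
  # same order in which the values were extracted.
  mat[upper.tri(mat)] <- tri
  # Mirror the upper triangle; the lower triangle is still zero here.
  mat <- mat + t(mat)
  diag(mat) <- diag_value
  mat
}

genes <- c("g1", "g2", "g3")
full <- matrix(c(1.0, 0.2, 0.5,
                 0.2, 1.0, 0.8,
                 0.5, 0.8, 1.0), ncol = 3, dimnames = list(genes, genes))
tri <- full[upper.tri(full)]
rebuilt <- tri2mat_sketch(genes, tri)
print(rebuilt)
```

The length check makes the ordering requirement on genes partly explicit: a vector of n genes must be paired with exactly n * (n - 1) / 2 triangular values, although it cannot detect genes supplied in the wrong order.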

Index

Topic datasets
    net1345
    net17
    net231
    net44
    net613
    net83

adj2mat
bootstrap
buildaracne
buildgenenet
buildspace
buildwgcna
edgecutoff
ena
gettableaddressing
lower.tri
mat2adj
net1345
net17
net231
net44
net613
net83
simulatenetwork
symmetricize
tri2mat
upper.tri
