Package 'BNSL'                                                September 18, 2017


Type: Package
Title: Bayesian Network Structure Learning
Version: 0.1.3
Date: 2017-09-19
Author: Joe Suzuki
Maintainer: Joe Suzuki <j-suzuki@sigmath.es.osaka-u.ac.jp>
Depends: bnlearn, igraph
Description: From a given data frame, this package learns its Bayesian network structure based on a selected score.
License: GPL (>= 2)
Imports: Rcpp (>= 0.12.0)
LinkingTo: Rcpp
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2017-09-18 07:06:54 UTC

R topics documented:
  BNSL-package
  bnsl
  cmi
  FFtable
  kruskal
  mi
  mi_matrix
  parent.set

BNSL-package            Bayesian Network Structure Learning

Description:
  From a given data frame, this package learns a Bayesian network structure based on a selected
  score. Currently, the package estimates mutual information and conditional mutual information,
  and combines the estimates to construct either a Bayesian network or an undirected forest; any
  undirected forest can be made a Bayesian network by adding appropriate edge directions.

Author(s):
  Maintainer: Joe Suzuki <j-suzuki@sigmath.es.osaka-u.ac.jp>

References:
  [1] Suzuki, J., "A theoretical analysis of the BDeu scores in Bayesian network structure learning", Behaviormetrika, 2017.
  [2] Suzuki, J., "A novel Chow-Liu algorithm and its application to gene differential analysis", International Journal of Approximate Reasoning, 2017.
  [3] Suzuki, J., "Efficient Bayesian network structure learning for maximizing the posterior probability", New Generation Computing, 2017.
  [4] Suzuki, J., "An estimator of mutual information and its application to independence testing", Entropy, Vol. 18, No. 4, 2016.
  [5] Suzuki, J., "Consistency of learning Bayesian network structures with continuous variables: An information theoretic approach", Entropy, Vol. 17, No. 8, pp. 5752-5770, 2015.
  [6] Suzuki, J., "Learning Bayesian network structures when discrete and continuous variables are present", Lecture Notes in Artificial Intelligence, Vol. 8754 (European Workshop on Probabilistic Graphical Models), pp. 471-486, Utrecht, Netherlands, Sept. 2014. Springer-Verlag.
  [7] Suzuki, J., "The Bayesian Chow-Liu algorithms", the sixth European Workshop on Probabilistic Graphical Models, pp. 315-322, Granada, Spain, Sept. 2012.
  [8] Suzuki, J., "Branch and Bound for Regular Bayesian Network Structure Learning", Uncertainty in Artificial Intelligence, pp. 212-221, Sydney, Australia, August 2017.

bnsl                    Bayesian Network Structure Learning

Description:
  The function outputs the Bayesian network structure given a dataset, based on an assumed
  criterion.

Usage:
  bnsl(df, tw = 0, proc = 1, s = 0, n = 0, ss = 1)

Arguments:
  df    a data frame.
  tw    the upper limit of the parent set size.
  proc  the criterion based on which the BNSL solution is sought: proc=1, 2, and 3 indicate
        that the structure learning is based on Jeffreys' prior [1], MDL [2,3], and BDeu [3],
        respectively.
  s     the value computed when obtaining the bound.
  n     the number of samples.
  ss    the BDeu parameter.

Value:
  The Bayesian network structure, in the bn class of bnlearn.

References:
  [1] Suzuki, J., "An Efficient Bayesian Network Structure Learning Strategy", New Generation Computing, December 2016.
  [2] Suzuki, J., "A construction of Bayesian networks from databases based on an MDL principle", Uncertainty in Artificial Intelligence, pp. 266-273, Washington, D.C., July 1993.
  [3] Suzuki, J., "Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: An Efficient Algorithm Using the B & B Technique", International Conference on Machine Learning, Bari, Italy, July 1996.
  [4] Suzuki, J., "A Theoretical Analysis of the BDeu Scores in Bayesian Network Structure Learning", Behaviormetrika, 1(1):1-20, January 2017.

See Also:
  parent.set

Examples:
  library(bnlearn)
  bnsl(asia)

cmi                     Bayesian Estimation of Conditional Mutual Information

Description:
  A standard estimator of conditional mutual information computes the maximum-likelihood value.
  However, that estimator takes positive values even when the pair follows a distribution of
  two conditionally independent variables. In contrast, the estimator in this package detects
  conditional independence, and also consistently estimates the true conditional mutual
  information value as the sample size grows, based on Jeffreys' prior, the Bayesian Dirichlet
  equivalent uniform score (BDeu [1]), or the MDL principle. It can estimate the conditional
  mutual information value even when one of the pair is continuous (see [2]).

Usage:
  cmi(x, y, z, proc = 0)

Arguments:
  x     a numeric vector.
  y     a numeric vector.
  z     a numeric vector. x, y, and z should have equal lengths.
  proc  the estimation is based on Jeffreys' prior, the MDL principle, and BDeu for proc=0, 1,
        2, respectively. If the argument proc is missing, proc=0 (Jeffreys') is assumed.

Value:
  The estimate of the conditional mutual information between x and y given z, based on the
  selected criterion; the natural logarithm (base e) is used.

References:
  [1] Suzuki, J., "A theoretical analysis of the BDeu scores in Bayesian network structure learning", Behaviormetrika, 2017.
  [2] Suzuki, J., "An estimator of mutual information and its application to independence testing", Entropy, Vol. 18, No. 4, 2016.
  [3] Suzuki, J., "The Bayesian Chow-Liu algorithms", the sixth European Workshop on Probabilistic Graphical Models, pp. 315-322, Granada, Spain, Sept. 2012.

Examples:
  n = 100
  x = c(rbinom(n, 1, 0.2), rbinom(n, 1, 0.8))
  y = c(rbinom(n, 1, 0.8), rbinom(n, 1, 0.2))
  z = c(rep(1, n), rep(0, n))
  cmi(x, y, z, proc = 0); cmi(x, y, z, 1); cmi(x, y, z, 2)
  x = c(rbinom(n, 1, 0.2), rbinom(n, 1, 0.8))
  u = rbinom(2 * n, 1, 0.1)
  y = (x + u)
  z = c(rep(1, n), rep(0, n))
  cmi(x, y, z); cmi(x, y, z, proc = 1); cmi(x, y, z, 2)

FFtable                 A Faster Version of fftable

Description:
  The same procedure as fftable, provided by the R language; the program is written using Rcpp
  for speed.

Usage:
  FFtable(df)

Arguments:
  df    a data frame.

Value:
  A frequency table of the last column, based on the states that are determined by the other
  columns.

See Also:
  fftable

Examples:
  library(bnlearn)
  FFtable(asia)
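For intuition, the behaviour described above can be imitated in base R with interaction() and table(); the following sketch (ff_sketch is a hypothetical name, not part of the package) reproduces the output shape, though not the speed of the Rcpp version:

```r
# Sketch (not the package's Rcpp code): a frequency table of the last
# column for each configuration of states of the remaining columns.
ff_sketch <- function(df) {
  key <- interaction(df[-ncol(df)], drop = TRUE)  # one level per state combination
  table(key, df[[ncol(df)]])
}

df <- data.frame(a = c(0, 0, 1, 1), b = c(0, 1, 0, 1), y = c(0, 1, 1, 1))
ff_sketch(df)  # 4 state combinations x 2 outcome values
```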

kruskal                 Given a Weight Matrix, Generate Its Maximum-Weight Forest

Description:
  The function lists the edges of a forest generated by Kruskal's algorithm, given a weight
  matrix whose weights should be symmetric but may be negative. The forest is a spanning tree
  if all elements of the matrix take positive values.

Usage:
  kruskal(W)

Arguments:
  W     a matrix.

Value:
  A matrix object with two columns for an n x n weight matrix, in which each row expresses an
  edge when the vertices are labeled 1 through n.

References:
  [1] Suzuki, J., "The Bayesian Chow-Liu algorithms", the sixth European Workshop on Probabilistic Graphical Models, pp. 315-322, Granada, Spain, Sept. 2012.

Examples:
  library(igraph)
  library(bnlearn)
  df = asia
  mi.mat = mi_matrix(df)
  edge.list = kruskal(mi.mat)
  edge.list
  g = graph_from_edgelist(edge.list, directed = FALSE)
  V(g)$label = colnames(df)
  plot(g)
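The idea behind the function can be sketched in a few lines of base R: sort the candidate edges by decreasing weight and keep an edge whenever it joins two different components, stopping once weights become non-positive, so that negative weights yield a forest rather than a spanning tree. This is an illustrative sketch (max_weight_forest is a hypothetical name), not the package's implementation:

```r
# Kruskal-style maximum-weight forest over a symmetric weight matrix w.
max_weight_forest <- function(w) {
  n <- nrow(w)
  parent <- seq_len(n)                         # union-find: each vertex its own root
  find <- function(i) { while (parent[i] != i) i <- parent[i]; i }
  edges <- matrix(0, nrow = 0, ncol = 2)
  idx <- which(upper.tri(w), arr.ind = TRUE)   # candidate edges (i < j)
  for (k in order(w[idx], decreasing = TRUE)) {
    i <- idx[k, 1]; j <- idx[k, 2]
    if (w[i, j] <= 0) break                    # weights are sorted, so we can stop
    ri <- find(i); rj <- find(j)
    if (ri != rj) {                            # edge joins two components: keep it
      parent[ri] <- rj
      edges <- rbind(edges, c(i, j))
    }
  }
  edges
}

w <- matrix(0, 3, 3)
w[1, 2] <- w[2, 1] <- 3
w[2, 3] <- w[3, 2] <- 2
w[1, 3] <- w[3, 1] <- 1
max_weight_forest(w)  # two edges, (1,2) and (2,3): the spanning tree
```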

mi                      Bayesian Estimation of Mutual Information

Description:
  A standard estimator of mutual information computes the maximum-likelihood value. However,
  that estimator takes positive values even when the pair follows a distribution of two
  independent variables. In contrast, the estimator in this package detects independence, and
  also consistently estimates the true mutual information value as the sample size grows, based
  on Jeffreys' prior, the Bayesian Dirichlet equivalent uniform score (BDeu [1]), or the MDL
  principle. It can estimate the mutual information value even when one of the pair is
  continuous (see [2]).

Usage:
  mi(x, y, proc = 0)

Arguments:
  x     a numeric vector.
  y     a numeric vector. x and y should have equal lengths.
  proc  the estimation is based on Jeffreys' prior, the MDL principle, and BDeu for proc=0, 1,
        2, respectively. If one of the two vectors is continuous, proc=10 should be chosen. If
        the argument proc is missing, proc=0 (Jeffreys') is assumed.

Value:
  The estimate of the mutual information between the two numeric vectors, based on the selected
  criterion; the natural logarithm (base e) is used.

References:
  [1] Suzuki, J., "A theoretical analysis of the BDeu scores in Bayesian network structure learning", Behaviormetrika, 2017.
  [2] Suzuki, J., "An estimator of mutual information and its application to independence testing", Entropy, Vol. 18, No. 4, 2016.
  [3] Suzuki, J., "The Bayesian Chow-Liu algorithms", the sixth European Workshop on Probabilistic Graphical Models, pp. 315-322, Granada, Spain, Sept. 2012.

See Also:
  cmi
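The bias mentioned in the description is easy to see with the plug-in (maximum-likelihood) estimator: computed from the empirical joint and marginal distributions, it is non-negative and almost never exactly zero on finite samples, even when x and y are independent. A base-R sketch (mi_plugin is a hypothetical helper, not the package's mi):

```r
# Plug-in (maximum-likelihood) estimate of mutual information, in nats.
# Not part of BNSL; shown only to illustrate the bias that the Bayesian
# estimators described above avoid.
mi_plugin <- function(x, y) {
  p <- table(x, y) / length(x)            # joint empirical distribution
  px <- rowSums(p); py <- colSums(p)      # marginals
  s <- 0
  for (i in seq_along(px)) for (j in seq_along(py))
    if (p[i, j] > 0) s <- s + p[i, j] * log(p[i, j] / (px[i] * py[j]))
  s
}

set.seed(1)
x <- rbinom(1000, 1, 0.5); y <- rbinom(1000, 1, 0.5)
mi_plugin(x, y)  # small but typically strictly positive, although x and y are independent
```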

Examples:
  n = 100
  x = rbinom(n, 1, 0.5); y = rbinom(n, 1, 0.5); mi(x, y)
  z = rbinom(n, 1, 0.1); y = (x + z)
  mi(x, y); mi(x, y, proc = 1); mi(x, y, 2)
  x = rnorm(n); y = rnorm(n); mi(x, y, proc = 10)
  x = rnorm(n); z = rnorm(n); y = 0.9 * x + sqrt(1 - 0.9^2) * z; mi(x, y, proc = 10)

mi_matrix               Generating a Matrix of Mutual Information Estimates

Description:
  The estimators in this package detect independence, and also consistently estimate the true
  mutual information value as the sample size grows, based on Jeffreys' prior, the Bayesian
  Dirichlet equivalent uniform score (BDeu [1]), or the MDL principle. They can estimate the
  mutual information value even when one of the pair is continuous (see [2]). Given a data
  frame, each column of which may be either discrete or continuous, this function generates its
  matrix of mutual information estimates.

Usage:
  mi_matrix(df, proc = 0)

Arguments:
  df    a data frame.
  proc  given two discrete vectors of equal length, the function estimates the mutual
        information based on Jeffreys' prior, the MDL principle, and BDeu for proc=0, 1, 2,
        respectively. If one of the columns is continuous, proc=10 should be chosen. If the
        argument proc is missing, proc=0 (Jeffreys') is assumed.

Value:
  The matrix of mutual information estimates between the pairs of columns of df, based on the
  selected criterion; the natural logarithm (base e) is used.

References:
  [1] Suzuki, J., "A theoretical analysis of the BDeu scores in Bayesian network structure learning", Behaviormetrika, 2017.
  [2] Suzuki, J., "An estimator of mutual information and its application to independence testing", Entropy, Vol. 18, No. 4, 2016.
  [3] Suzuki, J., "A novel Chow-Liu algorithm and its application to gene differential analysis", International Journal of Approximate Reasoning, Vol. 80, 2017.

See Also:
  mi

Examples:
  library(bnlearn)
  mi_matrix(asia)
  mi_matrix(asia, proc = 1)
  mi_matrix(asia, proc = 2)
  mi_matrix(asia, proc = 3)

parent.set              Parent Set

Description:
  This function estimates a parent set of h within each subset w, as follows. Suppose we are
  given a subset w of the p-1 variables excluding h, where p is the number of columns in df.
  Then a score is defined for each subset of w that expresses how likely that subset is to be
  the true parent set of h within w; currently, a Bayesian score (Jeffreys' prior) is applied.
  The function computes the maximum score z and the subset y of w that attains it, and it does
  so for every w, where w and y are expressed by binary sequences of length p. When the
  computation is heavy, it can be reduced by specifying the maximum size of w: if tw is zero
  (the default), the tw value is set to p-1; otherwise, the tw value expresses the maximum
  size.

Usage:
  parent.set(df, h, tw = 0, proc = 1)

Arguments:
  df    a data frame.
  h     an integer from 0 to p-1, where p is the number of columns in df.
  tw    an integer from 0 to p-1, where p is the number of columns in df.
  proc  the parent sets are estimated based on Jeffreys' prior (proc=0,1) [1], MDL (proc=2)
        [2,3], and BDeu (proc=3) [4].

Value:
  A data frame in which each row consists of a triple (w, y, z): w is a subset of the p-1
  variables excluding h; y is the estimated parent set within w; and z is the score of that
  parent set.

References:
  [1] Suzuki, J., "An Efficient Bayesian Network Structure Learning Strategy", New Generation Computing, December 2016.
  [2] Suzuki, J., "A Construction of Bayesian Networks from Databases Based on an MDL Principle", Proceedings of the Ninth Annual Conference on Uncertainty in Artificial Intelligence, The Catholic University of America, Washington, D.C., USA, July 9-11, 1993.
  [3] Suzuki, J., "Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: An Efficient Algorithm Using the B & B Technique", Proceedings of the Thirteenth International Conference on Machine Learning (ICML '96), Bari, Italy, July 3-6, 1996.
  [4] Suzuki, J., "A theoretical analysis of the BDeu scores in Bayesian network structure learning", Behaviormetrika, 2017.

See Also:
  cmi

Examples:
  library(bnlearn)
  df = asia
  parent.set(df, 7)
  parent.set(df, 7, 1)
  parent.set(df, 7, 2)
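The search space explored above can be pictured by enumerating the candidate subsets w as binary sequences, as the description says. The following base-R sketch (subsets_upto is a hypothetical name; it only enumerates, it does not score, and it uses 1-based column indices for simplicity) shows how a size cap tw shrinks that space:

```r
# Sketch: enumerate the subsets w of the p-1 variables excluding column h
# as binary sequences of length p-1, keeping those of size at most tw.
# Illustrates the search space of parent.set(); not the package's code.
subsets_upto <- function(p, h, tw) {
  cand <- setdiff(seq_len(p), h)       # candidate parents exclude column h
  out <- list()
  for (m in 0:(2^length(cand) - 1)) {  # every binary sequence of length p-1
    bits <- as.integer(intToBits(m))[seq_along(cand)]
    if (sum(bits) <= tw) out[[length(out) + 1]] <- cand[bits == 1]
  }
  out
}

length(subsets_upto(8, 7, 2))  # choose(7,0) + choose(7,1) + choose(7,2) = 29 subsets
```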

Index
  Topic package: BNSL-package
  BNSL (BNSL-package)
  BNSL-package
  bnsl
  cmi
  FFtable
  kruskal
  mi
  mi_matrix
  parent.set