Available online at www.sciencedirect.com

Procedia Engineering 15 (2011) 4026-4030

Advanced in Control Engineering and Information Science

A New Evolutionary Computation Based Approach for Learning Bayesian Network

Yungang Zhu a,b, Dayou Liu a,b,*, Haiyang Jia a,b

a College of Computer Science and Technology, Jilin University, Changchun 130012, China
b Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China

Abstract

The Bayesian network is a popular tool for handling uncertainty in Artificial Intelligence, and in recent years increasing attention has been paid to learning Bayesian networks. In this paper we propose a novel learning algorithm for Bayesian networks based on the (μ, λ)-Evolution Strategy: we present the encoding scheme and fitness function, and design the evolutionary operators of recombination, mutation and selection. Theoretical analysis and experimental results both demonstrate that the proposed method can learn a Bayesian network from data effectively.

© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [CEIS 2011]. Open access under CC BY-NC-ND license.

Keywords: Bayesian network; Evolution strategy; Evolutionary computation

1. Introduction

Bayesian networks are a powerful tool for managing uncertainty. They have been successfully applied to expert systems, diagnosis systems, decision support systems, etc. A Bayesian network integrates a graphical model with probability theory, and it makes explicit the internal relationships among variables. In recent years many researchers have paid much attention to learning algorithms for Bayesian networks. Learning the structure of a Bayesian network can be considered a specific example of the general problem of selecting a probabilistic model that explains a given set of data [1].

* Corresponding author. E-mail address: dyliu@jlu.edu.cn.

1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2011.08.755. Open access under CC BY-NC-ND license.
In this paper we propose a novel learning algorithm for Bayesian networks based on the Evolution Strategy. Specifically, we adopt the (μ, λ)-Evolution Strategy, an evolutionary computing method, to learn Bayesian networks. The rest of this paper is organized as follows: Section 2 presents brief background knowledge. The proposed method is described in Section 3. Section 4 presents the experimental results and a discussion of the proposed method. Finally, conclusions are summarized in Section 5.

2. Theoretical Background

A Bayesian network (BN) is a graphical model for representing relationships among variables. Given a set of variables X = {X_1, X_2, ..., X_n}, a Bayesian network is a tuple (G, Θ), where G is a directed acyclic graph (DAG) in which each node represents a variable and each directed edge represents a relationship between variables, and Θ = {P(X_i | π_i), 1 ≤ i ≤ n} represents the local conditional probability distribution of each node given the values of its parent nodes, where π_i is the parent set of X_i. Assume the range of X_i is {x_i^1, ..., x_i^{q_i}} and the range of π_i is {π_i^1, ..., π_i^{r_i}}. The local conditional probability distribution of each node is represented by θ_ijk = P(X_i = x_i^k | π_i = π_i^j), and Θ = ⋃_{i=1}^{n} ⋃_{j=1}^{r_i} ⋃_{k=1}^{q_i} {θ_ijk}.

The Evolution Strategy (ES) is one of the evolutionary computing methods. ES produces consecutive generations of individuals; during a generation, a selection method is used to select specific individuals, which form the new generation by recombination and mutation [2].

3. Learning Bayesian network based on Evolution Strategy

The problem of learning a Bayesian network can be stated as follows. Assuming that D represents the data, the purpose is to obtain the Bayesian network S that best fits D.

3.1. Encoding

In previous coding schemes the Bayesian network structure was encoded as an adjacency matrix or adjacency list, but such encodings produce a large number of cyclic graphs, which are illegal structures. Based on the coding schemes in [3][4], in this paper the code is divided into 3 parts.
The 1st part is a sequence of nodes. This order is the reverse of a topological sort of the network nodes, so there is no cycle. For example, the sequence for Figure 1 is 54231.

Fig. 1. A simple example of Bayesian network.

The 2nd part has n-1 segments; each segment indicates the parents of the corresponding node in the sequence above. The last node has no parent, so only n-1 segments are needed. For the structure of Figure 1, the second part of the code is 1110 111 10 0: segment 1 indicates that for node 5, nodes 4, 2 and 3 are parents and node 1
is not a parent; segment 2 indicates that for node 4, nodes 2, 3 and 1 are parents; segment 3 indicates that for node 2, node 3 is a parent and node 1 is not; and so on.

Furthermore, to keep the network structure simple, we restrict the number of parents of each node to at most k. In general k is much smaller than n, so the code above will contain many 0s. We therefore use a compressed form of the code that records only the positions of the 1s, in the order of the sequence from the first part; thus the final code is [234 345 4 0].

The 3rd part is the adaptive step size σ used by the mutation operator of the evolution strategy. In summary, the code of Figure 1 is [54231 234 345 4 0 σ].

3.2. Fitness function

We use the Bayesian Information Criterion (BIC) [1] scoring measure as the fitness function:

Fitness(S) = log P(D | S) − Pen(S) = Σ_{i=1}^{n} Σ_{j=1}^{r_i} Σ_{k=1}^{q_i} N_ijk log θ_ijk − (log L / 2) Σ_{i=1}^{n} r_i (q_i − 1)

where N_ijk is the number of samples in D for which X_i = x_i^k and π_i = π_i^j, L is the number of samples, q_i is the number of values of X_i, and r_i is the number of configurations of π_i. log P(D | S) measures the fit of S to the data D, and Pen(S) is a penalty on the structure of S that biases the learning algorithm toward concise models, which are easier to manage.

3.3. Evolutionary operators

Recombination. The recombination of the evolution strategy is the counterpart of crossover in genetic algorithms, but unlike GA crossover, recombination generates only one individual from two parent individuals. For the first part of the code we use the Partially Matched Crossover of GA [5], selecting one of the two resulting individuals at random. For the second part of the code, for each segment, a segment is selected randomly from the two parent individuals as the segment of the child individual. The third part is the step size used in mutation, and recombination uses the mid-value: if the third parts of the two parents are σ_1 and σ_2, after recombination it is (σ_1 + σ_2)/2.

Mutation. There are three types of mutation operator: adding an arc, deleting an arc, and reversing an arc.
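As an illustration, the three arc mutations can be sketched as follows. This is a minimal sketch, not the paper's implementation: the dict-of-parent-sets representation and the names `creates_cycle`, `mutate` and `k_max` are our own illustrative choices, and legality is enforced by walking ancestor links before committing an added or reversed arc.

```python
import random

def creates_cycle(parents, child, new_parent):
    """Return True if adding the arc new_parent -> child would create a
    directed cycle, i.e. if child is already an ancestor of new_parent."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def mutate(parents, k_max):
    """Apply one randomly chosen arc mutation (add / delete / reverse)
    in place, keeping the graph acyclic and every parent set <= k_max."""
    a, b = random.sample(list(parents), 2)
    op = random.choice(["add", "delete", "reverse"])
    if op == "add":
        if a not in parents[b] and len(parents[b]) < k_max \
                and not creates_cycle(parents, b, a):
            parents[b].add(a)                    # new arc a -> b
    elif op == "delete":
        parents[b].discard(a)                    # drop arc a -> b if present
    else:  # reverse an existing arc a -> b into b -> a
        if a in parents[b] and len(parents[a]) < k_max:
            parents[b].remove(a)
            if creates_cycle(parents, a, b):
                parents[b].add(a)                # undo: reversal made a cycle
            else:
                parents[a].add(b)
    return parents
```

A generation would then call `mutate` the number of times dictated by the individual's step size, as described below.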
Rather than performing a single mutation operator during mutation, σ·N(0,1) mutation operators are performed, where N(0,1) is a normally distributed random variable with mean 0 and variance 1.

Selection. Selection is strictly according to fitness, eliminating all the poor individuals and keeping all the good ones. The proposed algorithm uses the (μ, λ) selection strategy: μ parent individuals generate λ (λ > μ) child individuals, and μ individuals are selected from the resulting λ individuals as the next generation.

In summary, the pseudo-code of the learning algorithm based on the (μ, λ)-ES is described as follows:

Procedure ESBN(data D)
begin
  Generate μ Bayesian networks randomly as initial population S(0);
  Select a network randomly from S(0) as current best network S_max;
  for each S_i in S(0) do
    Fitness[S_i] = Cal-Fitness(S_i);            // calculate fitness
  end for
  S_max = Select_Top_One(S(0));
  S_min = Select_Lowest_One(S(0));
  t = 0;
  while (t < t_max and Fitness[S_max] − Fitness[S_min] ≥ ε) do
    for i = 1 to λ do                           // generate λ children individuals
      S_i(t) ← Random_Select(S(t));
      S_j(t) ← Random_Select(S(t));
      Child_i ← Recombination(S_i(t), S_j(t));  // recombination
      Child_i ← Mutation(Child_i);              // mutation
      Fitness[Child_i] = Cal-Fitness(Child_i);
    end for
    Select μ individuals from {Child_1, ..., Child_λ} as the next
      generation S(t+1) according to Fitness[Child_i] (1 ≤ i ≤ λ);  // selection
    t = t + 1;
    S_max = Select_Top_One(S(t));
    S_min = Select_Lowest_One(S(t));
  end while
  return S_max;
end

4. Experiment and discussion

We use benchmark experiment data generated from a classical Bayesian network called Alarm [6], which has 37 nodes. In detail, we generate a data set with 4000 samples; the first 3000 samples are used for learning and the last 1000 samples for testing. We split the learning samples into 3 groups containing 1000, 2000 and 3000 samples respectively, learn 3 Bayesian networks from the 3 groups, and test the learning accuracy separately. The algorithm is evaluated by the average Log-Loss of each learned network on this test set, that is (1/N) Σ_{i=1}^{N} log P_S(C_i) [7], where N is the number of test samples and C_i is the i-th test sample. For convenience we actually use the absolute value of the Log-Loss; this value measures how well the learned network fits the data, that is, the accuracy of the learned network, and the smaller the value, the better. The results are summarized in Figure 2.

Fig. 2. The learning performance of the proposed algorithm with the 3 groups of test data (absolute value of Log-Loss, 12 to 18, versus iteration number, 0 to 100, for N = 1000, 2000 and 3000).
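To make the evaluation metric concrete, the sketch below computes the absolute value of the average Log-Loss for a learned network. The two-node network and its conditional probability tables are invented purely for illustration; they are not the Alarm network or any model learned in the paper.

```python
import math

def avg_abs_log_loss(log_prob, test_samples):
    """Absolute value of the average Log-Loss |(1/N) * sum_i log P_S(C_i)|
    of a learned network S on a test set; smaller means a better fit."""
    return abs(sum(log_prob(c) for c in test_samples) / len(test_samples))

# Toy learned network X1 -> X2 with invented CPTs (illustrative only).
p_x1 = {0: 0.3, 1: 0.7}
p_x2_given_x1 = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}

def log_prob(sample):
    """log P_S(C) for one sample C = (x1, x2), factored over the network."""
    x1, x2 = sample
    return math.log(p_x1[x1]) + math.log(p_x2_given_x1[(x1, x2)])

score = avg_abs_log_loss(log_prob, [(1, 1), (0, 0), (1, 0)])
```

Samples that the network assigns high probability contribute little to the score, so a well-fit test set yields a small value, matching the "smaller is better" reading above.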
From Figure 2 we can see that the algorithm converges to a good network, and that the more samples are used for learning, the faster the algorithm converges and the better the Bayesian network obtained. This is because more data contains more statistical features, so the learned result is more accurate. The experiments show that the algorithm is effective.

5. Conclusions

In this paper, a (μ, λ)-Evolution Strategy based learning algorithm for Bayesian networks is proposed. An improved encoding scheme for the Bayesian network structure is presented, and the fitness function is designed based on the BIC scoring measure. The recombination, mutation and selection evolutionary operators are also proposed. Experimental results show that the proposed algorithm can learn a Bayesian network from data effectively.

Acknowledgements

Professor Dayou Liu is the corresponding author of this paper. This work is supported by the National Natural Science Foundation of China (NSFC) under Grants No. 60873149, 60973088 and 60773099. This work is also supported by the Open Projects of Shanghai Key Laboratory of Intelligent Information Processing at Fudan University under Grant No. IIPL-09-007.

References

[1] Daly R, Shen Q, Aitken S. Learning Bayesian networks: approaches and issues. The Knowledge Engineering Review, 2011, 26(2), p99-157.
[2] Gruttner M, Sehnke F, et al. Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients. The 20th International Conference on Artificial Neural Networks, 2010, p114-123.
[3] Zhang C, Shen YD, et al. Structure Learning of Belief Network by Genetic Algorithms: A New Network Encoding Method. Computer Science, 2004, 31(12), p103-105.
[4] Lee J, Chung W, et al. A new genetic approach to structure learning of Bayesian networks. Advances in Neural Networks - ISNN 2006, Lecture Notes in Computer Science 3971, Part 1, 2006, p659-668.
[5] Zhou CJ, Liang YC. Evolutionary Computation. 3rd ed. Changchun: Jilin University Press; 2009.
[6] Beinlich I, Suermondt G, et al.
The ALARM monitoring system: A case study with two probabilistic inference techniques. The 2nd European Conference on Artificial Intelligence in Medicine, 1989, p247-256.
[7] Friedman N. Learning belief networks in the presence of missing values and hidden variables. The 14th International Conference on Machine Learning, 1997, p125-133.