A New Evolutionary Computation Based Approach for Learning Bayesian Network

Procedia Engineering 15 (2011) 4026-4030
Advanced in Control Engineering and Information Science

Yungang Zhu a,b, Dayou Liu a,b,*, Haiyang Jia a,b

a College of Computer Science and Technology, Jilin University, Changchun 130012, China
b Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China

Abstract

Bayesian networks are a popular tool for processing uncertainty in Artificial Intelligence, and in recent years learning them from data has received growing attention. In this paper we propose a novel learning algorithm for Bayesian networks based on the (μ, λ)-Evolution Strategy: we present an encoding scheme and a fitness function, and design evolutionary operators for recombination, mutation and selection. Theoretical analysis and experimental results both demonstrate that the proposed method can learn a Bayesian network from data effectively.

© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [CEIS 2011]. Open access under CC BY-NC-ND license. doi:10.1016/j.proeng.2011.08.755

Keywords: Bayesian network; Evolution strategy; Evolutionary computation

1. Introduction

Bayesian networks have become a powerful tool for managing uncertainty. They have been successfully applied to expert systems, diagnosis systems, decision support systems, and more. A Bayesian network integrates a graphical model with probability theory, and it makes the dependence relationships among variables explicit. In recent years many researchers have paid close attention to learning algorithms for Bayesian networks. Learning the structure of a Bayesian network can be considered a specific instance of the general problem of selecting a probabilistic model that explains a given set of data [1].

* Corresponding author. E-mail address: dyliu@jlu.edu.cn.

In this paper we propose a novel learning algorithm for Bayesian networks based on the Evolution Strategy; in detail, we adopt the (μ, λ)-Evolution Strategy, an evolutionary computing method, to learn Bayesian networks. The rest of this paper is organized as follows: Section 2 presents brief background knowledge; the proposed method is described in Section 3; Section 4 presents the experimental results and a discussion; finally, conclusions are summarized in Section 5.

2. Theoretical Background

A Bayesian network (BN) is a graphical model for representing relationships among variables. Consider a set of variables $X = \{X_1, X_2, \ldots, X_n\}$. A Bayesian network is a tuple $(G, \Theta)$, where $G$ is a directed acyclic graph (DAG) in which each node represents a variable and each directed edge represents a relationship between variables, and $\Theta = \{P(X_i \mid \pi_i), 1 \le i \le n\}$ is the set of local conditional probability distributions of the nodes given the values of their parent nodes, $\pi_i$ being the parent set of $X_i$. Assume the range of $X_i$ is $\{x_i^1, \ldots, x_i^{q_i}\}$ and the range of $\pi_i$ is $\{\pi_i^1, \ldots, \pi_i^{r_i}\}$; then the local conditional probability distribution of each node is represented by $\theta_{ijk} = P(X_i = x_i^j \mid \pi_i = \pi_i^k)$, and $\Theta = \bigcup_{i=1}^{n} \bigcup_{j=1}^{q_i} \bigcup_{k=1}^{r_i} \{\theta_{ijk}\}$. A small concrete sketch of this representation is given at the end of this section.

Evolution Strategy (ES) is one of the evolutionary computing methods. An ES produces consecutive generations of individuals; within a generation, a selection method picks specific individuals, which form the new generation through recombination and mutation [2].
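To make the $(G, \Theta)$ notation concrete, the following minimal Python sketch (our own illustration, not from the paper; the example network and all names are hypothetical) stores G as a parent map and Θ as a table indexed by node, parent configuration and node value:

    # G as a parent dict; Theta as theta[node][parent_config][value]
    #   = P(X_i = value | pi_i = parent_config).
    parents = {
        "Rain": (),
        "Sprinkler": ("Rain",),
        "WetGrass": ("Rain", "Sprinkler"),
    }
    theta = {
        "Rain":      {(): {True: 0.2, False: 0.8}},
        "Sprinkler": {(True,):  {True: 0.01, False: 0.99},
                      (False,): {True: 0.40, False: 0.60}},
        "WetGrass":  {(True, True):   {True: 0.99, False: 0.01},
                      (True, False):  {True: 0.80, False: 0.20},
                      (False, True):  {True: 0.90, False: 0.10},
                      (False, False): {True: 0.00, False: 1.00}},
    }

    def prob(node, value, parent_config):
        """theta_ijk = P(X_i = x_i^j | pi_i = pi_i^k)."""
        return theta[node][parent_config][value]

    print(prob("WetGrass", True, (True, False)))   # -> 0.8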
3. Learning Bayesian network based on Evolution Strategy

The problem of learning a Bayesian network can be stated as follows: given data D, obtain the Bayesian network S that best fits D.

3.1. Encoding

In previous coding schemes the Bayesian network structure was encoded as an adjacency matrix or adjacency list, but those encodings produce a large number of cyclic graphs, which are illegal structures. Building on the coding schemes in [3][4], in this paper the code is divided into three parts.

The first part is a sequence of nodes. This order is the reverse of a topological sort of the network nodes, so no cycle can be represented. For example, the sequence for Figure 1 is 54231.

Fig. 1. A simple example of a Bayesian network.

The second part has n−1 segments; each segment indicates the parents of the corresponding node in the sequence above (the last node in the sequence has no parents, so only n−1 segments are needed). For the structure of Figure 1, the second part of the code is 1110 111 10 0: segment 1 indicates that for node 5, nodes 4, 2 and 3 are parents and node 1 is not; segment 2 indicates that for node 4, nodes 2, 3 and 1 are parents; segment 3 indicates that for node 2, node 3 is a parent and node 1 is not; and so on. Furthermore, to keep the network structure simple, we restrict the number of parents of each node to at most k. In general k is much smaller than n, so the code above contains many 0s. We therefore use a compressed form that records only the positions (within the first-part sequence) of the 1s; for example, the parents 4, 2 and 3 of node 5 occupy positions 2, 3 and 4 of the sequence 54231, so its segment compresses to 234, and the final code is [234 345 4 0].

The third part is the adaptive mutation step size σ of the evolution strategy.

In summary, the code of Figure 1 is [54231 234 345 4 0 σ]; a sketch of this encoding is given below.
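The following Python sketch (our own reading of the scheme; function and variable names are ours, and the example network is reconstructed from the code [54231 234 345 4 0 σ] above) builds the compressed second part from a node sequence and a parent map:

    def encode(sequence, parents, sigma):
        """sequence: nodes in reverse topological order, e.g. [5, 4, 2, 3, 1];
        parents: node -> set of parent nodes; sigma: mutation step size.
        Each segment lists the 1-based positions in `sequence` of the
        node's parents; [0] marks a node with no parents."""
        pos = {node: i + 1 for i, node in enumerate(sequence)}
        segments = []
        for node in sequence[:-1]:          # the last node needs no segment
            seg = sorted(pos[p] for p in parents[node])
            segments.append(seg if seg else [0])
        return sequence, segments, sigma

    parents = {5: {4, 2, 3}, 4: {2, 3, 1}, 2: {3}, 3: set(), 1: set()}
    print(encode([5, 4, 2, 3, 1], parents, 0.5))
    # -> ([5, 4, 2, 3, 1], [[2, 3, 4], [3, 4, 5], [4], [0]], 0.5)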
3.2. Fitness function

We use the Bayesian Information Criterion (BIC) [1] scoring measure as the fitness function:

$$\mathrm{Fitness}(S) = \log P(D \mid S) - \mathrm{Pen}(S) = \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} N_{ijk} \log \theta_{ijk} - \frac{\log L}{2} \sum_{i=1}^{n} r_i\,(q_i - 1)$$

where $N_{ijk}$ is the number of samples in the data D with $X_i = x_i^j$ and $\pi_i = \pi_i^k$, L is the number of samples, $q_i$ is the number of possible assignments of $X_i$, and $r_i$ is the number of configurations of $\pi_i$. The term $\log P(D \mid S)$ measures the fit of S to the data D, while Pen(S) is a penalty on the structure of S that steers the learning algorithm toward concise models, which are easier to manage. A sketch of computing this score is given below.
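As a minimal sketch (our own; helper and variable names are hypothetical, and the parameters are the usual maximum-likelihood estimates), the BIC fitness for discrete data can be computed as:

    import math
    from collections import Counter
    from itertools import product

    def bic_fitness(data, parents, states):
        """data: list of dicts {node: value}; parents: node -> tuple of parents;
        states: node -> list of possible values. Returns the BIC score."""
        L = len(data)
        score = 0.0
        for node, pa in parents.items():
            q = len(states[node])
            # N[(parent_config, value)]: samples matching both assignments
            N = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
            for pa_config in product(*(states[p] for p in pa)):
                N_k = sum(N[(pa_config, v)] for v in states[node])
                for v in states[node]:
                    if N[(pa_config, v)] > 0:
                        theta = N[(pa_config, v)] / N_k   # ML estimate of theta_ijk
                        score += N[(pa_config, v)] * math.log(theta)
            r = 1
            for p in pa:                                  # r_i parent configurations
                r *= len(states[p])
            score -= (math.log(L) / 2) * r * (q - 1)      # penalty term Pen(S)
        return score

    rows = [{"A": 0, "B": 1}, {"A": 1, "B": 1}, {"A": 0, "B": 0}]
    print(bic_fitness(rows, {"A": (), "B": ("A",)}, {"A": [0, 1], "B": [0, 1]}))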
3.3. Evolutionary operators

Recombination. The recombination operator of the evolution strategy plays the role of crossover in a genetic algorithm, but unlike GA crossover it generates only one individual from two parent individuals. For the first part of the code we use the Partially Matched Crossover of GA [5], picking one of the two resulting individuals at random. For the second part of the code, each segment of the child is selected at random from the corresponding segment of one of the two parent individuals. The third part is the mutation step size, which is recombined by taking the mid-value: if the third parts of the two parents are σ1 and σ2, the recombined value is (σ1 + σ2)/2.

Mutation. There are three types of mutation operator: adding an arc, deleting an arc, and reversing an arc. Rather than performing a single mutation operator per mutation, we perform σ·N(0,1) mutation operators (necessarily rounded to a non-negative integer), where N(0,1) is a normally distributed random variable with mean 0 and variance 1.

Selection. Selection is strictly according to fitness, eliminating all the poor individuals and keeping the good ones. The proposed algorithm uses the (μ, λ) selection strategy: μ parent individuals generate λ (λ > μ) child individuals, and μ individuals are selected from the resulting λ individuals as the next generation. A sketch of the mutation operator is given below.
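The following sketch (our interpretation of the mutation step; the rounding rule, the rejection of illegal structures, and all names are our assumptions) applies roughly σ·|N(0,1)| arc mutations while keeping the structure acyclic and within the parent limit k:

    import math
    import random

    def has_cycle(edges, nodes):
        """Kahn's algorithm: the graph is acyclic iff all nodes can be sorted."""
        indeg = {n: 0 for n in nodes}
        for _, v in edges:
            indeg[v] += 1
        queue = [n for n in nodes if indeg[n] == 0]
        seen = 0
        while queue:
            u = queue.pop()
            seen += 1
            for a, b in list(edges):
                if a == u:
                    indeg[b] -= 1
                    if indeg[b] == 0:
                        queue.append(b)
        return seen < len(nodes)

    def mutate(edges, nodes, sigma, k):
        """edges: set of (parent, child) arcs; applies ceil(sigma*|N(0,1)|) mutations."""
        edges = set(edges)
        n_ops = max(1, math.ceil(sigma * abs(random.gauss(0.0, 1.0))))
        for _ in range(n_ops):
            trial = set(edges)
            op = random.choice(("add", "delete", "reverse"))
            u, v = random.sample(list(nodes), 2)
            if op == "add":
                trial.add((u, v))
            elif op == "delete":
                trial.discard((u, v))
            elif (u, v) in trial:                 # reverse
                trial.discard((u, v))
                trial.add((v, u))
            n_parents = max(sum(1 for _, c in trial if c == m) for m in nodes)
            if not has_cycle(trial, nodes) and n_parents <= k:
                edges = trial                     # accept only legal structures
        return edges

    nodes = [1, 2, 3, 4, 5]                       # the network of Figure 1
    dag = {(4, 5), (2, 5), (3, 5), (2, 4), (3, 4), (1, 4), (3, 2)}
    print(mutate(dag, nodes, sigma=1.0, k=3))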
In summary, the pseudo-code of the learning algorithm based on the (μ, λ)-ES is described as follows:

    Procedure ESBN(data D)
    begin
        Generate μ Bayesian networks randomly as initial group S(0);
        for each S_i in S(0) do
            Fitness[S_i] = Cal-Fitness(S_i);                 // calculate fitness
        end for
        S_max = Select_Top_One(S(0));
        S_min = Select_Lowest_One(S(0));
        t = 0;
        while (t < t_max and Fitness[S_max] - Fitness[S_min] >= ε) do
            for i = 1 to λ do                                // generate λ children
                S_i(t) <- Random_Select(S(t));
                S_j(t) <- Random_Select(S(t));
                Children_i <- Recombination(S_i(t), S_j(t)); // recombination
                Children_i <- Mutation(Children_i);          // mutation
                Fitness[Children_i] = Cal-Fitness(Children_i);
            end for
            Select μ individuals from {Children_1, ..., Children_λ}      // selection
                as next generation S(t+1), according to Fitness[Children_i] (1 ≤ i ≤ λ);
            t = t + 1;
            S_max = Select_Top_One(S(t));
            S_min = Select_Lowest_One(S(t));
        end while
        return S_max;
    end

4. Experiment and discussion

We use benchmark data generated from the classical Bayesian network Alarm [6], which has 37 nodes. In detail, we generate a data set of 4000 samples; the first 3000 samples are used for learning and the last 1000 for testing. We split the learning samples into 3 groups of 1000, 2000 and 3000 samples respectively, learn 3 Bayesian networks from the 3 groups, and test the learning accuracy of each separately. The algorithm is evaluated by the average Log-Loss of each learned network on the test set, $\frac{1}{N} \sum_{i=1}^{N} \log P_S(C_i)$ [7], where N is the number of test samples and $C_i$ is the i-th test sample. For convenience we actually use the absolute value of the Log-Loss; it measures how well the learned network fits the data, i.e., the accuracy of the learned network, and the smaller the value, the better (a sketch of this metric is given after Figure 2). The results are summarized in Figure 2.

Fig. 2. The learning performance of the proposed algorithm with the 3 groups of training data (absolute value of Log-Loss, roughly 12-18, versus iteration number, 0-100; one curve each for N=1000, N=2000 and N=3000).
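As a minimal sketch (our own; it assumes the learned network assigns non-zero probability to every test sample, and reuses the theta layout from the Section 2 sketch), the evaluation metric can be computed as:

    import math

    def joint_prob(sample, parents, theta):
        """P_S(sample) = product over nodes of theta[node][parent_config][value]."""
        p = 1.0
        for node, pa in parents.items():
            p *= theta[node][tuple(sample[q] for q in pa)][sample[node]]
        return p

    def avg_abs_log_loss(test_samples, parents, theta):
        """|(1/N) * sum_i log P_S(C_i)|: the smaller, the better the fit."""
        total = sum(math.log(joint_prob(c, parents, theta)) for c in test_samples)
        return abs(total / len(test_samples))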

From Figure 2 we can see that the algorithm converges to a good network; the more samples used for learning, the faster the algorithm converges and the better the Bayesian network obtained. This is because more data contains more of the statistical features of the domain, so the learned result is more accurate. The experiments show that the algorithm is effective.

5. Conclusions

In this paper, a (μ, λ)-Evolution Strategy based learning algorithm for Bayesian networks is proposed. An improved encoding scheme for the Bayesian network structure is presented, and the fitness function is designed based on the BIC scoring measure. The recombination, mutation and selection evolutionary operators are also proposed. Experimental results show that the proposed algorithm can learn a Bayesian network from data effectively.

Acknowledgements

Professor Dayou Liu is the corresponding author of this paper. This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 60873149, 60973088 and 60773099. This work is also supported by the Open Projects of the Shanghai Key Laboratory of Intelligent Information Processing at Fudan University under Grant No. IIPL-09-007.

References

[1] Daly R, Shen Q, Aitken S. Learning Bayesian networks: approaches and issues. The Knowledge Engineering Review, 2011, 26(2), p99-157.
[2] Grüttner M, Sehnke F, et al. Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients. The 20th International Conference on Artificial Neural Networks, 2010, p114-123.
[3] Zhang C, Shen YD, et al. Structure Learning of Belief Network by Genetic Algorithms: A New Network Encoding Method. Computer Science, 2004, 31(12), p103-105.
[4] Lee J, Chung W, et al. A new genetic approach to structure learning of Bayesian networks. Advances in Neural Networks - ISNN 2006, Lecture Notes in Computer Science 3971, Part 1, 2006, p659-668.
[5] Zhou CJ, Liang YC. Evolutionary Computation. 3rd ed. Changchun: Jilin University Press; 2009.
[6] Beinlich I, Suermondt HJ, et al. The ALARM monitoring system: A case study with two probabilistic inference techniques for belief networks. The 2nd European Conference on Artificial Intelligence in Medicine, 1989, p247-256.
[7] Friedman N. Learning belief networks in the presence of missing values and hidden variables. The 14th International Conference on Machine Learning, 1997, p125-133.