An Introduction to Bayesian Networks: Concepts and Learning from Data


1 An Introduction to Bayesian Networks: Concepts and Learning from Data. Kyu-Baek Hwang and Byoung-Tak Zhang, Biointelligence Lab, School of Computer Science and Engineering, Seoul National University

2 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

3 Bayes Rule for Probabilistic Inference (1/3). Medical diagnosis based on a knowledge base (from [Mitchell 97]). Prior probability of cancer: P(cancer) = 0.008, P(-cancer) = 0.992. Test for cancer: P(+ | cancer) = 0.98, P(- | cancer) = 0.02, P(+ | -cancer) = 0.03, P(- | -cancer) = 0.97.

4 Bayes Rule for Probabilistic Inference (2/3). If the result of the test is positive, how probable is the case of cancer? We should calculate P(cancer | +): probabilistic inference from the given probabilities (the knowledge base). By Bayes rule, P(cancer | +) = P(+ | cancer) P(cancer) / P(+).

5 Bayes Rule for Probabilistic Inference (3/3). Marginalization: how to calculate P(+)? P(cancer | +) + P(-cancer | +) should be one, and P(+) = P(+ | cancer) P(cancer) + P(+ | -cancer) P(-cancer) = 0.98 x 0.008 + 0.03 x 0.992 = 0.0376. Hence P(cancer | +) = 0.00784 / 0.0376 ≈ 0.21 and P(-cancer | +) = 0.02976 / 0.0376 ≈ 0.79.
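
To make the arithmetic concrete, here is a minimal Python sketch (the variable names are our own) that reproduces the numbers above:

    # Prior and test characteristics from the slides (Mitchell, 1997).
    p_cancer = 0.008
    p_pos_given_cancer = 0.98     # P(+ | cancer)
    p_pos_given_healthy = 0.03    # P(+ | -cancer)

    # Marginalization: P(+) = P(+|cancer)P(cancer) + P(+|-cancer)P(-cancer).
    p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)

    # Bayes rule: P(cancer | +) = P(+ | cancer) P(cancer) / P(+).
    p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

    print(round(p_pos, 4))               # 0.0376
    print(round(p_cancer_given_pos, 3))  # 0.209

Even a positive result from this fairly accurate test leaves the probability of cancer at only about 21%, because the prior is so low.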

6 Bayesian Network for Medical Diagnosis. Causal relationship: cancer → test. The network stores P(cancer) (cancer: 0.008, -cancer: 0.992) and P(test | cancer) (+ | cancer: 0.98, - | cancer: 0.02, + | -cancer: 0.03, - | -cancer: 0.97). In our example, the knowledge base corresponds to the joint probability distribution: P(cancer, test) = P(test | cancer) P(cancer).

7 Bayesian Networks. A compact representation of a knowledge base (probabilities). Qualitative part (graph theory): a directed acyclic graph (DAG); vertices are variables, and edges express the dependency or influence of one variable on another. Quantitative part (probability theory): a set of conditional probabilities for all variables. Naturally handles the problems of complexity and uncertainty.

8 Joint Probability as a Product of Conditional Probabilities. Factoring can dramatically reduce the parameters needed for data modeling in Bayesian networks. [Figure: the ALARM network (MINVOLSET, PULMEMBOLUS, INTUBATION, KINKEDTUBE, VENTMACH, DISCONNECT, ..., BP): 37 variables in total; 509 parameters instead of 2^54 for the unfactored joint.] From the NIPS '01 tutorial by Friedman, N. and Koller, D.

9 Probabilistic Graphical Models. Graphical models: an undirected graph gives a Markov random field; a directed graph gives a Bayesian network. [Figure: an undirected and a directed graph over nodes A, B, C, D, E.]

10 Representation of the Joint Probability. Markov random field: P(A, B, C, D, E) = (1/Z) φ1(·) φ2(·) φ3(·), where the φ_i are potential functions over subsets of the variables and Z is the normalization constant. Bayesian network: P(A, B, C, D, E) = P(A) P(B | pa(B)) P(C | pa(C)) P(D | pa(D)) P(E | pa(E)), one conditional probability per node given its parents pa(·).

11 Real-World Applications of BNs. Intelligent agents: the Microsoft Office assistant (Bayesian user modeling). Medical diagnosis: PATHFINDER (Heckerman, 1992), diagnosis of lymph-node disease, commercialized as INTELLIPATH. Control and decision support systems. Speech recognition (HMMs). Genome data analysis: gene expression, DNA sequence, and combined analysis of heterogeneous data. Turbocodes (channel coding).

12 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

13 Causal Networks. Node: an event. Arc: a causal relationship between two nodes; A → B means A causes B. Causal network for the car start problem [Jensen 01]: Fuel, Clean Spark Plugs, Fuel Meter Standing, Start.

14 Reasoning with Causal Networks. My car does not start: this increases the certainty of no fuel and of dirty spark plugs, and increases the certainty that the fuel meter stands on empty. The fuel meter stands on half: this decreases the certainty of no fuel and increases the certainty of dirty spark plugs. (Network: Fuel, Clean Spark Plugs, Fuel Meter Standing, Start.)

15 d-separation. A rule describing the influences between nodes. Connections in causal networks: serial (A → B → C), diverging (A ← B → C), and converging (A → B ← C). Definition [Jensen 01]: two nodes in a causal network are d-separated if, for all paths between them, there is an intermediate node V such that either the connection is serial or diverging and the state of V is known, or the connection is converging and neither V nor any of V's descendants has received evidence.

16 d-separation Example 1. Without evidence on B: in the serial (A → B → C) and diverging (A ← B → C) connections, A and C are marginally dependent; in the converging connection (A → B ← C), A and C are marginally independent. Given evidence on B: in the serial and diverging connections, A and C are conditionally independent; in the converging connection, A and C are conditionally dependent.

17 d-separation Example 2. There exists a non-blocked path; hence, the two black nodes (variables) are not d-separated and are possibly dependent on each other.

18 d-separation Example 2 (Cont'd). Every path is blocked now; hence, the two black nodes (variables) are d-separated and independent of each other.

19 d-separation: Car Start Problem. 1. Start and Fuel are dependent on each other. 2. Start and Clean Spark Plugs are dependent on each other. 3. Fuel and Fuel Meter Standing are dependent on each other. 4. Fuel and Clean Spark Plugs are conditionally dependent on each other given the value of Start. 5. Fuel Meter Standing and Start are conditionally independent given the value of Fuel.

20 Probability for Quantifying Certainty in Causal Networks. Basic axioms: P(A) = 1 iff A is certain; Σ P(A) = 1, where the summation is taken over all possible values of A; P(A or B) = P(A) + P(B) iff A and B are mutually exclusive. d-separation in probability calculus: an event in the causal network becomes a random variable, and if A and B are d-separated, then P(A | B) = P(A), i.e., A and B are probabilistically independent.

21 Quantitative Specification by Probability Calculus. Fundamentals: conditional probability, P(A | B) = P(A, B) / P(B); product rule, P(A, B) = P(A | B) P(B); chain rule, a successive application of the product rule: P(X1, X2, ..., Xn) = P(Xn | X1, ..., Xn-1) P(Xn-1 | X1, ..., Xn-2) ... P(X2 | X1) P(X1).

22 Definition: Bayesian Networks. A Bayesian network consists of the following. A set of n variables X = {X1, X2, ..., Xn} and a set of directed edges between variables; the variables (vertices) with the directed edges form a directed acyclic graph (DAG) structure, so directed cycles are not modeled. To each variable Xi with parents pa(Xi), there is attached a conditional probability table for P(Xi | pa(Xi)). Modeling of continuous variables is also possible.

23 Bayesian Network for the Car Start Problem. P(Fu = Yes) = 0.98; P(CS = Yes) = 0.96. P(FMS | Fu): a table over FMS ∈ {Full, Half, Empty} for Fu = Yes and Fu = No. P(St = Yes | Fu, CS): (Yes, Yes) = 0.99; (Yes, No) = 0.01; (No, Yes) = 0; (No, No) = 0; P(St = No | Fu, CS) is the complement.

24 The Car Start Problem Revisited. 1. No start: P(St = No) = 1 (evidence 1). Update the conditional probabilities P(Fu | St = No), P(CS | St = No), and P(FMS | St = No). 2. The fuel meter stands on half: P(FMS = Half) = 1 (evidence 2). Update the conditional probabilities P(Fu | St = No, FMS = Half) and P(CS | St = No, FMS = Half).

25 Calculation of Conditional Probabilities. The calculation of P(CS | St = No, FMS = Half) is as follows: P(CS | St, FMS) = P(CS, St, FMS) / P(St, FMS) = Σ_Fu P(Fu, CS, St, FMS) / Σ_{Fu, CS} P(Fu, CS, St, FMS), where the summations are taken over all possible values of the variables. For large networks, calculating conditional probabilities by direct marginalization can be computationally infeasible.

26 Initial State. [Figure: the marginals P(Fu), P(CS), P(St), and P(FMS) before any evidence.]

27 No Start. [Figure: P(Fu | St = No), P(CS | St = No), and P(FMS | St = No).]

28 Fuel Meter Stands on Half. [Figure: P(Fu | St = No, FMS = Half) and P(CS | St = No, FMS = Half).]

29 Bayesian Networks: Revisited. Definition: a graphical model for the probabilistic relationships among a set of variables; a compact representation of joint probability distributions on the basis of conditional probabilities. It consists of a qualitative part, a set of n variables X = {X1, X2, ..., Xn} and directed edges between variables forming a directed acyclic graph (DAG), and a quantitative part, a conditional probability table P(Xi | pa(Xi)) for each variable Xi and its parents pa(Xi). Modeling of continuous variables is also possible.

30 Independence of Two Events. Marginal independence: P(A, B) = P(A) P(B), equivalently P(A | B) = P(A). Conditional independence: P(A, B | C) = P(A | C) P(B | C), equivalently P(A | B, C) = P(A | C).

31 X = {X1, X2, ..., X10}. pa(X5): the parents of X5; Ch(X5): the children of X5; De(X5): the descendants of X5. Topological sort of X: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10. Chain rule in reverse order: P(X) = P(X10 | X \ {X10}) P(X \ {X10}) = P(X10 | pa(X10)) P(X \ {X10}), because X10 is d-separated from the remaining variables given pa(X10); repeating the argument for X9, X8, ..., X1 yields P(X) = Π_{i=1}^{10} P(Xi | pa(Xi)).

32 A Bayesian Network Represents the Joint Probability Distribution. By the d-separation property, the Bayesian network over n variables X = {X1, X2, ..., Xn} represents P(X) as follows: P(X1, X2, ..., Xn) = Π_{i=1}^{n} P(Xi | pa(Xi)). Given the joint probability distribution, any conditional probability can be calculated in principle.

33 An Illustration of Conditional Independence in BNs. [Figure: a Netica screenshot.]

34 Inference Example. Chain X1 → X2 → X3 with P(X1) = (0.6, 0.4); P(X2 | X1): X1 = 0: (0.2, 0.8), X1 = 1: (0.5, 0.5); P(X3 | X2): X2 = 0: (0.3, 0.7), X2 = 1: (0.7, 0.3).

35 Initial State. P(X2) = Σ_{X1, X3} P(X1, X2, X3) = Σ_{X1} P(X1) P(X2 | X1) = (0.6 x 0.2 + 0.4 x 0.5, 0.6 x 0.8 + 0.4 x 0.5) = (0.32, 0.68); P(X3) follows by marginalizing X2 out of P(X3 | X2) P(X2) in the same way. (CPTs as on the previous slide.)

36 Given that X3 = 1. P(X1 | X3 = 1) = β P(X1, X3 = 1) = β Σ_{X2} P(X1, X2, X3 = 1) = β Σ_{X2} P(X1) P(X2 | X1) P(X3 = 1 | X2) = β P(X1) Σ_{X2} P(X2 | X1) P(X3 = 1 | X2) = β (0.6 x 0.38, 0.4 x 0.5) = β (0.228, 0.200) = (0.53, 0.47), where β is the normalization constant. (CPTs as on slide 34.)
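
Both calculations can be checked with a short brute-force enumeration of the joint P(X1) P(X2 | X1) P(X3 | X2); this is a sketch with our own naming, not code from the tutorial:

    from itertools import product

    p_x1 = [0.6, 0.4]                      # P(X1)
    p_x2_x1 = [[0.2, 0.8], [0.5, 0.5]]     # P(X2 | X1), indexed [x1][x2]
    p_x3_x2 = [[0.3, 0.7], [0.7, 0.3]]     # P(X3 | X2), indexed [x2][x3]

    def joint(x1, x2, x3):
        # P(X1, X2, X3) = P(X1) P(X2 | X1) P(X3 | X2)
        return p_x1[x1] * p_x2_x1[x1][x2] * p_x3_x2[x2][x3]

    # Marginal P(X2): sum out X1 and X3.
    p_x2 = [sum(joint(x1, x2, x3) for x1, x3 in product(range(2), repeat=2))
            for x2 in range(2)]
    print([round(p, 2) for p in p_x2])  # [0.32, 0.68]

    # Posterior P(X1 | X3 = 1): sum out X2, then normalize (the beta above).
    unnorm = [sum(joint(x1, x2, 1) for x2 in range(2)) for x1 in range(2)]
    z = sum(unnorm)
    print([round(u / z, 2) for u in unnorm])  # [0.53, 0.47]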

37 Learning Example. Structure: X1 → X2 → X3. Data: 22 examples, each giving binary values for X1, X2, and X3.

38 Parameter Learning. From the 22 examples: P(X1) = (13/22, 9/22); P(X2 | X1): X1 = 0: (7/13, 6/13), X1 = 1: (3/9, 6/9).
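
The estimates above are just relative frequencies. A minimal counting sketch follows (the data rows here are illustrative; the 22 binary examples on the original slide did not survive extraction):

    from collections import Counter

    # Illustrative rows (x1, x2, x3); not the slide's actual 22 examples.
    data = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0)]

    def mle_cpt(data, child, parent=None):
        # Relative-frequency (MLE) estimate of P(child) or P(child | parent).
        if parent is None:
            counts = Counter(row[child] for row in data)
            return {v: c / len(data) for v, c in counts.items()}
        pair = Counter((row[parent], row[child]) for row in data)
        totals = Counter(row[parent] for row in data)
        return {(pv, cv): c / totals[pv] for (pv, cv), c in pair.items()}

    print(mle_cpt(data, child=0))            # plays the role of P(X1) = (13/22, 9/22)
    print(mle_cpt(data, child=1, parent=0))  # plays the role of P(X2 | X1)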

39 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

40 Learning Bayesian Networks. Data acquisition → preprocessing → Bayesian network learning (guided by prior knowledge) → BN structure + local probability distributions. Bayesian network learning involves: * structure search, * score metric, * parameter learning.

41 Learning Bayesian Networks (Cont'd). Bayesian network learning consists of structure learning (the DAG structure) and parameter learning (the local probability distributions). Four situations: known structure and complete data; unknown structure and complete data; known structure and incomplete data; unknown structure and incomplete data.

42 Parameter Learning. Task: given a network structure, estimate the parameters of the model from data. [Figure: a network over A, B, C, D with example CPT entries (e.g., 0.99, 0.07) to be estimated from samples s1, s2, ..., sM by maximizing the likelihood L(Θ; D).]

43 Key Point: Independence of Parameter Estimation. D = {s1, s2, ..., sM}, where s = (a, b, c, d) is an instance of the random vector S = (A, B, C, D). Assumption: the samples s are independent and identically distributed (i.i.d.). Then L(Θ_G; D) = P(D | Θ_G) = Π_{m=1}^{M} P(s_m | Θ_G) = Π_{m=1}^{M} P(a_m) P(b_m | a_m) P(c_m | a_m, b_m) P(d_m | b_m, c_m), so the likelihood factors into one product per node: independent parameter estimation for each node (variable).

44 One can estimate the parameters for P(A), P(B | A), P(C | A, B), and P(D | B, C) in an independent manner, using the data columns (a1, ..., aM), (b1, ..., bM), (c1, ..., cM), (d1, ..., dM). If A, B, C, and D are all binary-valued, the number of parameters is reduced from 2^4 - 1 = 15 for the full joint to 1 + 2 + 4 + 4 = 11, since P(A, B, C, D) = P(A) x P(B | A) x P(C | A, B) x P(D | B, C).

45 Methods for Parameter Estimation. Maximum likelihood estimation: choose the value of Θ which maximizes the likelihood of the observed data D: Θ̂ = argmax_Θ L(Θ; D) = argmax_Θ P(D | Θ). Bayesian estimation: represent the uncertainty about the parameters using a probability distribution over Θ; Θ is a random variable rather than a point value: P(Θ | D) = P(D | Θ) P(Θ) / P(D), i.e., posterior ∝ likelihood x prior.

46 Bayes Rule, MAP and ML. Bayes rule: P(h | D) = P(D | h) P(h) / P(D), where h is a hypothesis (models or parameters) and D is the data. ML (maximum likelihood) estimation: h* = argmax_h P(D | h). MAP (maximum a posteriori) estimation: h* = argmax_h P(h | D). Bayesian learning: keep P(h | D) itself, not a point estimate but the posterior distribution. From the NIPS '99 tutorial by Ghahramani, Z.

47 Bayesian Estimation for the Multinomial Distribution. Prior: θ ~ Dir(α1, α2, ..., αK), P(θ) ∝ Π_k θ_k^(α_k - 1). Likelihood of data D = {s1, s2, ..., sM}: P(D | θ) = Π_k θ_k^(N_k), where the counts N_k are the sufficient statistics. Posterior: P(θ | D) ∝ P(D | θ) P(θ) ∝ Π_k θ_k^(α_k + N_k - 1), again a Dirichlet. Posterior mean: E[θ_k | D] = (α_k + N_k) / Σ_l (α_l + N_l), a smoothed version of the MLE; the α_k encode prior knowledge as pseudo counts.

48 An Example: Coin Toss. Data D: N_H = 5 heads and N_T = 1 tail. Maximum likelihood estimation: P(D | θ) = θ^5 (1 - θ), so θ̂ = argmax_θ P(D | θ) = N_H / (N_H + N_T) = 5/6. Bayesian inference with θ ~ Dir(1, 1): P(θ | D) ∝ θ^5 (1 - θ), and P(heads | D) = E[θ | D] = (N_H + 1) / (N_H + N_T + 2) = 6/8 = 0.75.
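
In code, the two estimates are one line each; a minimal sketch of the Beta/Dirichlet conjugacy described above:

    n_heads, n_tails = 5, 1   # the data on the slide
    a_heads, a_tails = 1, 1   # Dirichlet(1, 1) prior, i.e., uniform over theta

    # Maximum likelihood: the relative frequency.
    theta_mle = n_heads / (n_heads + n_tails)
    print(theta_mle)          # 0.8333...

    # Bayesian: the posterior is Dirichlet(a_heads + n_heads, a_tails + n_tails),
    # and the predictive P(heads | D) is its mean.
    theta_bayes = (a_heads + n_heads) / (a_heads + n_heads + a_tails + n_tails)
    print(theta_bayes)        # 0.75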

49 [Figure/table: estimates of θ under the MLE and under Dirichlet priors (0.5, 0.5), (1, 1), (2, 2), and (5, 5) after short sequences of tosses (T, TT, ...).]

50 Variation of the Posterior Distribution for the Parameter. [Figure: posterior densities under the priors (0.5, 0.5), (1, 1), (2, 2), and (5, 5).]

51 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning (Scoring Metric, Search Strategy) / Practical Applications: DNA Microarray Classification, Dependency Analysis / Summary

52 Structural Learning. Task: given a data set, search for the most plausible network structure underlying the generation of the data set. Metric (score-based) approach: use a scoring metric to measure how well a particular structure fits the observed set of cases. [Figure: samples s1, s2, ..., sM are evaluated against candidate structures over A, B, C, D by a scoring metric, and a search strategy proposes structures to score.]

53 Scoring Metric. P(G | D) = P(G) P(D | G) / P(D) ∝ P(G) P(D | G): a prior over network structures times the marginal likelihood. Likelihood score: Score(G; D) = log P(D | G, Θ_MLE) = M Σ_{i=1}^{n} I(X_i; pa(X_i)) plus a structure-independent term. Nodes with high mutual information (dependency) with their parents get a higher score. Since I(X; Y) ≤ I(X; {Y, Z}), the fully connected network is obtained in the unrestricted case: the likelihood score is prone to overfitting.

54 Likelihood Score in Relation to Information Theory. log L(Θ̂; D) = Σ_{i=1}^{n} Σ_j Σ_k N_ijk log (N_ijk / N_ij) = M Σ_{i=1}^{n} I(X_i; pa(X_i)) - M Σ_{i=1}^{n} H(X_i), where j ranges over parent configurations of X_i and k over the values of X_i. The entropy term does not depend on the structure; the empty network scores -M Σ_i H(X_i), and adding parents improves the score by the mutual information they contribute.

55 Bayesian Score. Considers the uncertainty of the parameter estimates in the Bayesian network: Score(G; D) = P(G | D) ∝ P(G) ∫ P(D | G, Θ) P(Θ | G) dΘ. Assuming complete data and parameter independence, the marginal likelihood can be rewritten as P(D | G) = Π_{i=1}^{n} ∫ P(D_{X_i} | pa(X_i), θ_i) P(θ_i | G) dθ_i: a marginal likelihood for each pair (X_i; pa(X_i)).

56 Bayesian Dirichlet Score. For the multinomial case, if we assume a Dirichlet prior for each parameter, θ_ij ~ Dir(α_ij1, α_ij2, ...) (Heckerman, 1995): P(D | G) = Π_i Π_j [Γ(α_ij) / Γ(α_ij + N_ij)] Π_k [Γ(α_ijk + N_ijk) / Γ(α_ijk)], where N_ijk = # of cases with X_i = x_i^k and pa(X_i) in configuration j, N_ij = Σ_k N_ijk, and α_ij = Σ_k α_ijk. Gamma function: Γ(n + 1) = n Γ(n) = n!, Γ(1) = 1, Γ(x) = ∫_0^∞ t^(x-1) e^(-t) dt.

57 Bayesian Dirichlet Score (Cont'd). Worked coin-toss example: with prior θ ~ Dir(α_H, α_T) and data D containing N_H heads and N_T tails, P(D | G) = [Γ(α_H + α_T) / Γ(α_H + α_T + N_H + N_T)] x [Γ(α_H + N_H) / Γ(α_H)] x [Γ(α_T + N_T) / Γ(α_T)], evaluated term by term using Γ(n + 1) = n Γ(n).

59 Bayesian Dirichlet Score (Cont'd). P(D | G) = Π_{i=1}^{n} ∫ P(D_{X_i} | pa(X_i), θ_i) P(θ_i) dθ_i = Π_{i=1}^{n} Π_j [Γ(α_ij) / Γ(α_ij + N_ij)] Π_k [Γ(α_ijk + N_ijk) / Γ(α_ijk)]. log P(D | G), the Bayesian score, is asymptotically equivalent to BIC (Schwarz, 1978) and to minus the MDL criterion (Rissanen, 1978): BIC(G; D) = log P(D | Θ̂, G) - (dim G / 2) log M, where dim G = # of parameters in G.
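
In practice the Gamma-function products are evaluated in log space. Here is a sketch of a scorer for a single (X_i, pa(X_i)) family under our own naming, following the N_ijk / α_ijk notation above:

    from math import lgamma

    def bd_family_score(counts, alpha):
        # Log marginal likelihood of one family.
        # counts[j][k] = N_ijk: cases with pa(X_i) in configuration j and X_i = k;
        # alpha[j][k] = the matching Dirichlet pseudo-counts.
        score = 0.0
        for n_j, a_j in zip(counts, alpha):
            score += lgamma(sum(a_j)) - lgamma(sum(a_j) + sum(n_j))
            score += sum(lgamma(a + n) - lgamma(a)
                         for a, n in zip(a_j, n_j))
        return score

    # Coin example: one variable, no parents (a single configuration),
    # 5 heads and 1 tail under a Dirichlet(1, 1) prior: log(1/42).
    print(bd_family_score([[5, 1]], [[1, 1]]))  # approximately -3.738

Summing this quantity over all families gives log P(D | G), the decomposability exploited on the next slide.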

60 Structure Search. Given a data set, a score metric, and a set of possible structures, find the network structure with maximal score: a discrete optimization problem. One can exploit the decomposability of the score into independent terms, one per pair (X_i, pa(X_i)): P(D | G) = Π_{i=1}^{n} P(D_{X_i} | pa(X_i), G), so Score(G; D) = log P(D | G) = Σ_{i=1}^{n} Score(X_i; pa(X_i)).

61 Tree-Structured Networks. Definition: each node has at most one parent. An efficient search algorithm exists. Improvement over the empty network: Score(G; D) = Σ_i Score(X_i | pa(X_i)) = Σ_i [Score(X_i | pa(X_i)) - Score(X_i)] + Σ_i Score(X_i), where the last term is the score of the empty network. Chow and Liu (1968): construct the complete undirected graph with the weight of edge E(X_i, X_j) set to I(X_i; X_j); build a maximum weighted spanning tree; transform it into a directed tree with an arbitrary root node.

62 [Figure: from samples s1, s2, ..., sM over A, B, C, D, compute the pairwise mutual informations I(A;B), I(A;C), I(A;D), I(B;C), I(B;D), I(C;D) as weights on the complete undirected graph, extract the maximum spanning tree, and orient its edges away from an arbitrary root to obtain the directed tree; see the sketch below.]
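
A compact sketch of the Chow-Liu procedure (our own implementation, assuming a matrix of discrete samples): empirical mutual information supplies the edge weights, and a Prim-style pass grows the maximum-weight spanning tree, which is then read off as parent-child edges away from the root:

    import numpy as np
    from collections import Counter

    def mutual_info(x, y):
        # Empirical I(X;Y) for two discrete columns.
        n = len(x)
        pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
        return sum((c / n) * np.log(c * n / (px[a] * py[b]))
                   for (a, b), c in pxy.items())

    def chow_liu_tree(data):
        # Return directed edges (parent, child), rooted at variable 0.
        n_vars = data.shape[1]
        w = {(i, j): mutual_info(data[:, i], data[:, j])
             for i in range(n_vars) for j in range(i + 1, n_vars)}
        in_tree, edges = {0}, []
        while len(in_tree) < n_vars:
            i, j = max((e for e in w if (e[0] in in_tree) != (e[1] in in_tree)),
                       key=lambda e: w[e])
            parent, child = (i, j) if i in in_tree else (j, i)
            edges.append((parent, child))
            in_tree.add(child)
        return edges

    data = np.array([[0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]])
    print(chow_liu_tree(data))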

63 Search Strategies for General Bayesian Networks. With more than one parent allowed per node, the problem is NP-hard (Chickering, 1996), so heuristic search methods are usually employed: greedy hill-climbing (local search), greedy hill-climbing with random restarts, simulated annealing, and tabu search.

64 Greedy Local Search Algorithm. INPUT: data set, scoring metric, possible structures. Initialize the structure (empty network, random network, tree-structured network, etc.). Repeatedly perform the local edge operation (INSERT, DELETE, or REVERSE) that yields the largest improvement in score among all possible operations, continuing while Score(G_{t+1}; D) > Score(G_t; D); otherwise stop and return G_final = G_t.
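
A skeleton of that loop follows; this is a sketch, not the tutorial's code: score can be any decomposable metric (such as the BD family score above, summed over families), and acyclicity checking is delegated to a helper the caller supplies:

    def neighbors(graph, n_vars):
        # All graphs one edge operation away: INSERT, DELETE, or REVERSE.
        # graph maps each node to its set of parents.
        for i in range(n_vars):
            for j in range(n_vars):
                if i == j:
                    continue
                g = {v: set(ps) for v, ps in graph.items()}
                if i in g[j]:
                    g[j].discard(i)                      # DELETE i -> j
                    yield g
                    g2 = {v: set(ps) for v, ps in g.items()}
                    g2[i].add(j)                         # REVERSE i -> j
                    yield g2
                else:
                    g[j].add(i)                          # INSERT i -> j
                    yield g

    def greedy_search(data, n_vars, score, is_acyclic):
        graph = {v: set() for v in range(n_vars)}        # empty initial network
        best = score(graph, data)
        while True:
            moves = [g for g in neighbors(graph, n_vars) if is_acyclic(g)]
            candidate = max(moves, key=lambda g: score(g, data))
            if score(candidate, data) <= best:
                return graph                             # local maximum reached
            graph, best = candidate, score(candidate, data)

Because the score decomposes per family, a real implementation would re-score only the one or two families an edge operation touches rather than the whole graph.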

65 Enhanced Search. Greedy local search can get stuck in local maxima or on plateaux. Standard heuristics to escape these include search with random restarts, simulated annealing, tabu search, and genetic algorithms (population-based search). [Figure: score landscapes comparing greedy search, greedy search with random restarts, simulated annealing, and population-based search.]

66 Learning Bayesian Networks: Summary. In a Bayesian network, the joint probability distribution is a product of conditional probabilities, one per variable (node); a sum in log representation. Parameter learning: estimation of the local probability distributions by MLE or Bayesian estimation. Structure learning: score metrics (likelihood, Bayesian score, BIC, MDL) and search strategies (optimization in a discrete space to maximize the score); the key concept is the decomposability of the score; tree-structured and general Bayesian networks.

67 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification (Naïve Bayes, TAN), Dependency Analysis / Summary

68 DNA Microarray Data Analysis. Classification: gene expression data of 72 leukemia patients; task: classification of samples into AML or ALL based on expression patterns using Bayesian networks. Combined analysis of gene expression data and drug activity data: gene expression and drug activity data of 60 cancer samples; task: construct a dependency network of genes and drugs. Tools: WEKA, a collection of machine learning algorithms for data mining tasks implemented in Java, open-source software issued under the GNU General Public License; Bayesian network learning algorithms are included. The BNJ software is used for visualization when needed.

69 DNA Microarrays. Recent developments in the technology for biological experiments have made it possible to produce massive biological data sets: microarrays monitor thousands of gene expression levels simultaneously (vs. traditional one-gene experiments), giving a parallel view of the expression patterns of thousands of genes in a cell. Bayesian networks are a useful tool for identifying a variety of meaningful relationships among genes from such data.

70 Bayesian Networks in Microarray Data Analysis. From expression profiles, Bayesian network construction supports: sample classification (disease diagnosis); gene-gene relation analysis (activation or inhibition between genes); gene regulatory network analysis (a global view of the relations among genes); and combined analysis with other biological data (DNA sequence data, drug activity data, and so on).

71 DNA Microarray: Image Analysis. [Figure: microarray images.]

72 Data Preparation for Data Mining. [Figure: microarray image samples (Sample 1, ..., Sample M) pass through image analysis to yield a numerical gene-by-sample data matrix for data mining.]

73 Example 1: Tumor Type Classification. Task: classification of leukemia samples into two classes based on gene expression patterns using Bayesian networks. [Figure: DNA microarray data from two kinds of leukemia patients; selected genes (Gene A, Gene B, Gene C, Gene D) and the target variable; the learned Bayesian network over these nodes.]

74 Data Sets. 72 samples in total: 25 AML (acute myeloid leukemia) samples + 47 ALL (acute lymphoblastic leukemia) samples (Golub et al., 1999). A data sample consists of expression measurements for over 7,000 genes. Preprocessing: 30 informative genes were selected by the P-metric score, P(g) = (μ_AML - μ_ALL) / (σ_AML + σ_ALL), and expression values were discretized into 2 levels, High and Low. The final data set is 72 samples, each consisting of High or Low expression values of the 30 genes.

75 Classification by Bayesian Networks. Classification as inference for one variable (node) in a Bayesian network. Bayes optimal classifier: c*(g) = argmax_{c ∈ C} Σ_h P(c | g, h) P(h | D). If P(ĥ | D) ≈ 1, then c*(g) = argmax_{c ∈ C} P(c | g, ĥ), with P(c_k | g, ĥ) ∝ P(c_k, g | ĥ). In the learned network, this joint factors as P(type, g) = P(type) P(pc | type) P(ltc4s | type) P(zyxin | type) P(c_myb | type, pc) P(fah | type, ltc4s) P(mb_1 | type, fah).

76 Naïve Bayes Classifier. A very restricted form of Bayesian network: conditional independence of all variables (nodes) given the value of the class variable. Though simple, it is still widely used in classification tasks: P(type, g) = P(type) P(c_myb | type) P(zyxin | type) P(pc | type) P(ltc4s | type) P(fah | type) P(mb_1 | type).
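
A minimal Naive Bayes sketch over High/Low discretized genes; the gene names follow the slide, but the probability values here are invented for illustration, not the ones learned from the leukemia data:

    p_type = {"AML": 25 / 72, "ALL": 47 / 72}   # class prior from the sample counts
    p_gene = {                                   # P(gene | type); values are made up
        "zyxin": {("High", "AML"): 0.8, ("Low", "AML"): 0.2,
                  ("High", "ALL"): 0.1, ("Low", "ALL"): 0.9},
        "c_myb": {("High", "AML"): 0.3, ("Low", "AML"): 0.7,
                  ("High", "ALL"): 0.85, ("Low", "ALL"): 0.15},
    }

    def classify(sample):
        # Posterior over classes: P(c) * prod_i P(g_i | c), then normalize.
        post = dict(p_type)
        for gene, value in sample.items():
            for c in post:
                post[c] *= p_gene[gene][(value, c)]
        z = sum(post.values())
        return {c: p / z for c, p in post.items()}

    print(classify({"zyxin": "High", "c_myb": "Low"}))  # strongly favors AML here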

77 Tree-Augmented Network (TAN) (Friedman et al., 1997). Naïve Bayes plus a tree-structured network over the variables other than the class node: it expresses the dependency between variables in a restricted way. The class is the root node, and each X_i can have at most one parent besides the class node; e.g., P(zyxin | type) becomes P(zyxin | type, fah).

78 Results (NB, TAN). Leave-one-out cross-validation: Bayesian network learning with 71 samples, test on the remaining one; 72 iterations. Correct classifications out of 72 (6 genes / 30 genes): NB 66/72, 68/72; TAN 69/72, 66/72; general BN 68/72, 67/72; general BN with max #parents = 2: 69/72, 65/72; with max #parents = 3: 70/72, 67/72.

79 Markov Blanket in a BN. The Markov blanket of a node T satisfies P(T | X \ {T}) = P(T | MB(T)), where MB(T) = pa(T) ∪ Ch(T) ∪ (pa(Ch(T)) \ {T}). [Figure: a ten-node network X1, ..., X10 and a TAN, with the Markov blanket of the target node highlighted.]
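
The definition translates directly into code; a sketch, with the network given as a mapping from each node to its parent set:

    def markov_blanket(parents, t):
        # MB(T) = pa(T), Ch(T), and the children's other parents.
        children = {v for v, ps in parents.items() if t in ps}
        co_parents = {p for c in children for p in parents[c]} - {t}
        return parents[t] | children | co_parents

    # Example: X1 -> X2, X2 -> X3, X2 -> X4.
    parents = {1: set(), 2: {1}, 3: {2}, 4: {2}}
    print(markov_blanket(parents, 2))  # {1, 3, 4}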

80 Example 2: Gene-Drug Dependency Analysis. Task: construct a Bayesian network for the combined analysis of gene expression data and drug activity data. Pipeline: gene expression data and drug activity data are preprocessed (thresholding, discretization); genes, drugs, and a cancer-type node are selected; Bayesian network learning produces the learned network, which supports dependency analysis and probabilistic inference.

81 Data Sets. NCI60 data sets (Scherf et al., 2000): gene expression measurements and the activities of 1,400 chemical compounds for 60 human cancer samples. Preprocessing: 12 genes and 4 drugs were selected based on the correlation analysis in Scherf et al. (2000), considering the learning time and the visualization of Bayesian networks; gene expression values and drug activity values were discretized into 3 levels, High, Mid, Low. Nodes in the Bayesian network: 12 genes, 4 drugs, and the cancer type.

82 Results. Visualization with the BNJ 3 software; identification of a plausible dependency among 2 genes and 1 drug.

83 Inferring a Gene Regulation Model. Reconstruction of regulatory relations among genes from genome data. One of the challenges in reconstructing gene regulatory networks using Bayesian networks is statistical robustness, which arises from the sparse nature of gene expression data: usually, the number of samples is much smaller than the number of genes (attributes), which can produce unstable, spurious results. Remedies: the bootstrap, model averaging, and incorporation of prior biological knowledge.

84 Bootstrap-Based Approach. Learn multiple Bayesian networks from bootstrapped samples (Friedman, N. et al., 2000; Pe'er, D. et al., 2001). Expression data: yeast cell-cycle data, 76 samples for 800 genes. Resampling with replacement yields data sets D1, D2, ..., DL and learned networks G1, G2, ..., GL (Friedman et al., 2000). Estimate feature statistics, e.g., "gene Y is in the Markov blanket of gene X": conf(f) = (1/L) Σ_{i=1}^{L} f(G_i). This identifies significant pairwise relations (Friedman, 2000) and subnetworks (Pe'er, 2001).
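
A sketch of the bootstrap confidence estimate; learn_network and feature are placeholders for whichever structure learner and network feature (e.g., a Markov blanket membership test) one plugs in:

    import random

    def bootstrap_confidence(data, feature, learn_network, n_boot=100):
        # conf(f) = (1/L) * sum_i f(G_i) over networks learned from resamples.
        hits = 0
        for _ in range(n_boot):
            resample = [random.choice(data) for _ in data]  # with replacement
            hits += feature(learn_network(resample))        # feature returns 0 or 1
        return hits / n_boot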

85 Model-Averaging-Based Approach. Estimate the confidence of features by averaging over the posterior distribution of models (Hartemink et al., 2002). Expression data plus location data, which serves as a prior; structure learning with simulated annealing over networks G1, G2, ..., GL: conf(f) = Σ_{i=1}^{L} P(G_i | D) f(G_i).

86 Incorporation of Prior Knowledge or Assumptions. Impose biologically motivated assumptions on the network structure. Construction of a gene regulation model from gene expression data by inferring regulator sets (Pe'er, D., et al., 2002). Assumption: a relatively small set of genes is directly involved in transcriptional regulation. Limit the number of candidate regulators, the number of parents for each node (gene), and the number of genes having outgoing edges. This significantly reduces the number of possible models and improves the statistical robustness of the identified regulator sets.

87 [Figure: 3,622 genes in total; a candidate regulator set of 456 genes. Gene expression data: remove noninformative genes, discretize; 358 samples from combined data sets for S. cerevisiae. Learning with structure constraints, scoring Score(g; pa(g)) by the dependence I(g; pa(g)); the result is a 2-level network in which a regulator set R1, R2, ..., RM sits above the target genes (Pe'er et al., 2002).]

88 Summary. Bayesian networks provide an efficient and effective framework for organizing a body of knowledge by encoding the probabilistic relationships among the variables of interest. Graph theory + probability theory: a DAG plus local probability distributions; conditional independence and conditional probability are the keystones. They are a compact way to express complex systems by simpler probabilistic modules, and thus a natural framework for dealing with complexity and uncertainty. Two problems in learning Bayesian networks from data: parameter estimation (MLE, MAP, Bayesian estimation) and structural learning (tree-structured networks, heuristic search for general Bayesian networks).

89 Summary (Cont'd). Not covered in this tutorial but important: probabilistic inference in general Bayesian networks (exact and approximate inference); incomplete data (missing data, hidden variables) with known or unknown structure, where the Expectation-Maximization algorithm can be employed as in latent variable models; Bayesian model averaging over structures as well as parameters; dynamic Bayesian networks for temporal data. Many applications: text and web mining, medical diagnosis, intelligent agent systems, bioinformatics.

90 References: Bayesian networks (papers & books)
Chow, C. and Liu, C., Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory, 14(3), pp. 462-467, 1968.
Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
Neapolitan, R.E., Probabilistic Reasoning in Expert Systems, Wiley, 1990.
Charniak, E., Bayesian networks without tears, AI Magazine, 12(4), pp. 50-63, 1991.
Heckerman, D., Geiger, D., and Chickering, D.M., Learning Bayesian networks: the combination of knowledge and statistical data, Machine Learning, 20, pp. 197-243, 1995.
Jensen, F.V., An Introduction to Bayesian Networks, Springer-Verlag, 1996.
Friedman, N., Geiger, D., and Goldszmidt, M., Bayesian network classifiers, Machine Learning, 29(2), pp. 131-163, 1997.
Mitchell, T.M., Machine Learning, The McGraw-Hill Companies, 1997.
Frey, B.J., Graphical Models for Machine Learning and Digital Communication, MIT Press, 1998.
Jordan, M.I. (ed.), Learning in Graphical Models, Kluwer Academic Publishers, 1998.
Friedman, N. and Goldszmidt, M., Learning Bayesian networks with local structure, in Learning in Graphical Models (ed. Jordan, M.I.), MIT Press, 1998.
Heckerman, D., A tutorial on learning with Bayesian networks, in Learning in Graphical Models (ed. Jordan, M.I.), MIT Press, 1998.
Pearl, J. and Russell, S., Bayesian networks, UCLA Cognitive Systems Laboratory, Technical Report R-277, November 2000.
Spirtes, P., Glymour, C., and Scheines, R., Causation, Prediction, and Search, 2nd edition, MIT Press, 2000.
Jensen, F.V., Bayesian Networks and Decision Graphs, Springer, 2001.
Pearl, J., Bayesian networks, causal inference and knowledge discovery, UCLA Cognitive Systems Laboratory, Technical Report R-281, March 2001.
Korb, K.B. and Nicholson, A.E., Bayesian Artificial Intelligence, CRC Press, 2004.

91 Graphical models and Bayesian networks: tutorials & lectures
Friedman, N. and Koller, D., Learning Bayesian Networks from Data, NIPS 2001 tutorial.
Murphy, K., A Brief Introduction to Graphical Models and Bayesian Networks.
Ghahramani, Z., Probabilistic Models for Unsupervised Learning, NIPS 1999 tutorial.
Ghahramani, Z., Bayesian Methods for Machine Learning.
Moser, Probabilistic Independence Networks I-II, lecture notes.
oet, M., Special Topics: Belief Networks, lecture notes.

92 Bayesian networks with applications to bioinformatics
Murphy, K. and Mian, S., Modelling gene expression data using dynamic Bayesian networks, Technical report, Computer Science Division, University of California, Berkeley, CA, 1999.
Friedman, N., Linial, M., Nachman, I., and Pe'er, D., Using Bayesian networks to analyze expression data, Journal of Computational Biology, 7(3/4), pp. 601-620, 2000.
Cai, D., Delcher, A., Kao, B., and Kasif, S., Modeling splice sites with Bayes networks, Bioinformatics, 16(2), pp. 152-158, 2000.
Pe'er, D., Regev, A., Elidan, G., and Friedman, N., Inferring subnetworks from perturbed expression profiles, Bioinformatics, 17(suppl. 1), pp. S215-S224, 2001.
Pe'er, D., Regev, A., and Tanay, A., Minreg: inferring an active regulator set, Bioinformatics, 18(suppl. 1), 2002.
Hartemink, A.J., Gifford, D.K., Jaakkola, T.S., and Young, R.A., Combining location and expression data for principled discovery of genetic regulatory network models, Pacific Symposium on Biocomputing, 7, pp. 437-449, 2002.
Imoto, S., Kim, S., Goto, T., Aburatani, S., Tashiro, K., Kuhara, S., and Miyano, S., Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network, Journal of Bioinformatics and Computational Biology, 1(2), 2003.
Jansen, R., et al., A Bayesian networks approach for predicting protein-protein interactions from genomic data, Science, 302, pp. 449-453, 2003.
Troyanskaya, O.G., Dolinski, K., Owen, A.B., Altman, R.B., and Botstein, D., A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae), PNAS, 100(14), pp. 8348-8353, 2003.
Beer, M.A. and Tavazoie, S., Predicting gene expression from sequence, Cell, 117(2), pp. 185-198, 2004.
Friedman, N., Inferring cellular networks using probabilistic graphical models, Science, 303(5659), pp. 799-805, 2004.
A comprehensive list is available online.

93 Bayesian Network Software Packages. Bayes Net Toolbox (Kevin Murphy): a variety of algorithms for learning and inference in graphical models, written in MATLAB. WEKA: Bayesian network learning and classification modules are included among a collection of machine learning algorithms written in Java. Detailed lists and comparisons: the Bayesian network software list in the Google directory (.../Belief_networks/software/) and Korb, K.B. and Nicholson, A.E., Bayesian Artificial Intelligence, CRC Press, 2004.


More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

xp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ

xp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ CSE 455/555 Sprng 2013 Homework 7: Parametrc Technques Jason J. Corso Computer Scence and Engneerng SUY at Buffalo jcorso@buffalo.edu Solutons by Yngbo Zhou Ths assgnment does not need to be submtted and

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Nonlinear Classifiers II

Nonlinear Classifiers II Nonlnear Classfers II Nonlnear Classfers: Introducton Classfers Supervsed Classfers Lnear Classfers Perceptron Least Squares Methods Lnear Support Vector Machne Nonlnear Classfers Part I: Mult Layer Neural

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

Statistical learning

Statistical learning Statstcal learnng Model the data generaton process Learn the model parameters Crteron to optmze: Lkelhood of the dataset (maxmzaton) Maxmum Lkelhood (ML) Estmaton: Dataset X Statstcal model p(x;θ) (θ parameters)

More information

International Journal of Mathematical Archive-3(3), 2012, Page: Available online through ISSN

International Journal of Mathematical Archive-3(3), 2012, Page: Available online through   ISSN Internatonal Journal of Mathematcal Archve-3(3), 2012, Page: 1136-1140 Avalable onlne through www.ma.nfo ISSN 2229 5046 ARITHMETIC OPERATIONS OF FOCAL ELEMENTS AND THEIR CORRESPONDING BASIC PROBABILITY

More information