An Introduction to Bayesian Networks: Concepts and Learning from Data


1 An Introduction to Bayesian Networks: Concepts and Learning from Data. Kyu-Baek Hwang and Byoung-Tak Zhang, Biointelligence Lab, School of Computer Science and Engineering, Seoul National University

2 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

3 Bayes Rule for Probabilistic Inference (1/3). Medical diagnosis based on a knowledge base (from [Mitchell 97]). Prior probability of cancer: P(cancer) = 0.008, P(-cancer) = 0.992. Test for cancer: P(+ | cancer) = 0.98, P(- | cancer) = 0.02, P(+ | -cancer) = 0.03, P(- | -cancer) = 0.97.

4 Bayes Rule for Probabilistic Inference (2/3). If the result of the test is positive, how probable is the case of cancer? We should calculate P(cancer | +): probabilistic inference from the given probabilities (the knowledge base). By Bayes rule, P(cancer | +) = P(+ | cancer) P(cancer) / P(+).

5 Bayes Rule for Probabilistic Inference (3/3). Marginalization: how to calculate P(+)? P(cancer | +) + P(-cancer | +) should be one, and P(+) = P(+ | cancer) P(cancer) + P(+ | -cancer) P(-cancer) = 0.98 x 0.008 + 0.03 x 0.992 = 0.0376. Hence P(cancer | +) = 0.00784 / 0.0376 ≈ 0.21 and P(-cancer | +) = 0.02976 / 0.0376 ≈ 0.79.
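
To make the arithmetic concrete, here is a minimal Python sketch (the variable names are our own) that reproduces the numbers above:

    # Prior and test characteristics from the slides (Mitchell, 1997).
    p_cancer = 0.008
    p_pos_given_cancer = 0.98     # P(+ | cancer)
    p_pos_given_healthy = 0.03    # P(+ | -cancer)

    # Marginalization: P(+) = P(+|cancer)P(cancer) + P(+|-cancer)P(-cancer).
    p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)

    # Bayes rule: P(cancer | +) = P(+ | cancer) P(cancer) / P(+).
    p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

    print(round(p_pos, 4))               # 0.0376
    print(round(p_cancer_given_pos, 3))  # 0.209

Even a positive result from this fairly accurate test leaves the probability of cancer at only about 21%, because the prior is so low.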

6 Bayesian Network for Medical Diagnosis. Causal relationship: cancer → test. The network stores P(cancer) (cancer: 0.008, -cancer: 0.992) and P(test | cancer) (+ | cancer: 0.98, - | cancer: 0.02, + | -cancer: 0.03, - | -cancer: 0.97). In our example, the knowledge base corresponds to the joint probability distribution: P(cancer, test) = P(test | cancer) P(cancer).

7 Bayesian Networks. A compact representation of a knowledge base (probabilities). Qualitative part (graph theory): a directed acyclic graph (DAG); vertices are variables, and edges express the dependency or influence of one variable on another. Quantitative part (probability theory): a set of conditional probabilities for all variables. Naturally handles the problems of complexity and uncertainty.

8 Joint Probability as a Product of Conditional Probabilities. Factoring can dramatically reduce the parameters needed for data modeling in Bayesian networks. [Figure: the ALARM network (MINVOLSET, PULMEMBOLUS, INTUBATION, KINKEDTUBE, VENTMACH, DISCONNECT, ..., BP): 37 variables in total; 509 parameters instead of 2^54 for the unfactored joint.] From the NIPS '01 tutorial by Friedman, N. and Koller, D.

9 Probabilistic Graphical Models. Graphical models: an undirected graph gives a Markov random field; a directed graph gives a Bayesian network. [Figure: an undirected and a directed graph over nodes A, B, C, D, E.]

10 Representation of the Joint Probability. Markov random field: P(A, B, C, D, E) = (1/Z) φ1(·) φ2(·) φ3(·), where the φ_i are potential functions over subsets of the variables and Z is the normalization constant. Bayesian network: P(A, B, C, D, E) = P(A) P(B | pa(B)) P(C | pa(C)) P(D | pa(D)) P(E | pa(E)), one conditional probability per node given its parents pa(·).

11 Real-World Applications of BNs. Intelligent agents: the Microsoft Office assistant (Bayesian user modeling). Medical diagnosis: PATHFINDER (Heckerman, 1992), diagnosis of lymph-node disease, commercialized as INTELLIPATH. Control and decision support systems. Speech recognition (HMMs). Genome data analysis: gene expression, DNA sequence, and combined analysis of heterogeneous data. Turbocodes (channel coding).

12 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

13 Causal Networks. Node: an event. Arc: a causal relationship between two nodes; A → B means A causes B. Causal network for the car start problem [Jensen 01]: Fuel, Clean Spark Plugs, Fuel Meter Standing, Start.

14 Reasoning with Causal Networks. My car does not start: this increases the certainty of no fuel and of dirty spark plugs, and increases the certainty that the fuel meter stands on empty. The fuel meter stands on half: this decreases the certainty of no fuel and increases the certainty of dirty spark plugs. (Network: Fuel, Clean Spark Plugs, Fuel Meter Standing, Start.)

15 d-separation. A rule describing the influences between nodes. Connections in causal networks: serial (A → B → C), diverging (A ← B → C), and converging (A → B ← C). Definition [Jensen 01]: two nodes in a causal network are d-separated if, for all paths between them, there is an intermediate node V such that either the connection is serial or diverging and the state of V is known, or the connection is converging and neither V nor any of V's descendants has received evidence.

16 d-separation Example 1. Without evidence on B: in the serial (A → B → C) and diverging (A ← B → C) connections, A and C are marginally dependent; in the converging connection (A → B ← C), A and C are marginally independent. Given evidence on B: in the serial and diverging connections, A and C are conditionally independent; in the converging connection, A and C are conditionally dependent.

17 d-separation Example 2. There exists a non-blocked path; hence, the two black nodes (variables) are not d-separated and are possibly dependent on each other.

18 d-separation Example 2 (Cont'd). Every path is blocked now; hence, the two black nodes (variables) are d-separated and independent of each other.

19 d-separation: Car Start Problem. 1. Start and Fuel are dependent on each other. 2. Start and Clean Spark Plugs are dependent on each other. 3. Fuel and Fuel Meter Standing are dependent on each other. 4. Fuel and Clean Spark Plugs are conditionally dependent on each other given the value of Start. 5. Fuel Meter Standing and Start are conditionally independent given the value of Fuel.

20 Probability for Quantifying Certainty in Causal Networks. Basic axioms: P(A) = 1 iff A is certain; Σ P(A) = 1, where the summation is taken over all possible values of A; P(A or B) = P(A) + P(B) iff A and B are mutually exclusive. d-separation in probability calculus: an event in the causal network becomes a random variable, and if A and B are d-separated, then P(A | B) = P(A), i.e., A and B are probabilistically independent.

21 Quantitative Specification by Probability Calculus. Fundamentals: conditional probability, P(A | B) = P(A, B) / P(B); product rule, P(A, B) = P(A | B) P(B); chain rule, a successive application of the product rule: P(X1, X2, ..., Xn) = P(Xn | X1, ..., Xn-1) P(Xn-1 | X1, ..., Xn-2) ... P(X2 | X1) P(X1).

22 Definition: Bayesian Networks. A Bayesian network consists of the following. A set of n variables X = {X1, X2, ..., Xn} and a set of directed edges between variables; the variables (vertices) with the directed edges form a directed acyclic graph (DAG) structure, so directed cycles are not modeled. To each variable Xi with parents pa(Xi), there is attached a conditional probability table for P(Xi | pa(Xi)). Modeling of continuous variables is also possible.

23 Bayesian Network for the Car Start Problem. P(Fu = Yes) = 0.98; P(CS = Yes) = 0.96. P(FMS | Fu): a table over FMS ∈ {Full, Half, Empty} for Fu = Yes and Fu = No. P(St = Yes | Fu, CS): (Yes, Yes) = 0.99; (Yes, No) = 0.01; (No, Yes) = 0; (No, No) = 0; P(St = No | Fu, CS) is the complement.

24 The Car Start Problem Revisited. 1. No start: P(St = No) = 1 (evidence 1). Update the conditional probabilities P(Fu | St = No), P(CS | St = No), and P(FMS | St = No). 2. The fuel meter stands on half: P(FMS = Half) = 1 (evidence 2). Update the conditional probabilities P(Fu | St = No, FMS = Half) and P(CS | St = No, FMS = Half).

25 Calculation of Conditional Probabilities. The calculation of P(CS | St = No, FMS = Half) is as follows: P(CS | St, FMS) = P(CS, St, FMS) / P(St, FMS) = Σ_Fu P(Fu, CS, St, FMS) / Σ_{Fu, CS} P(Fu, CS, St, FMS), where the summations are taken over all possible values of the variables. For large networks, calculating conditional probabilities by direct marginalization can be computationally infeasible.

26 Initial State. [Figure: the marginals P(Fu), P(CS), P(St), and P(FMS) before any evidence.]

27 No Start. [Figure: P(Fu | St = No), P(CS | St = No), and P(FMS | St = No).]

28 Fuel Meter Stands on Half. [Figure: P(Fu | St = No, FMS = Half) and P(CS | St = No, FMS = Half).]

29 Bayesian Networks: Revisited. Definition: a graphical model for the probabilistic relationships among a set of variables; a compact representation of joint probability distributions on the basis of conditional probabilities. It consists of a qualitative part, a set of n variables X = {X1, X2, ..., Xn} and directed edges between variables forming a directed acyclic graph (DAG), and a quantitative part, a conditional probability table P(Xi | pa(Xi)) for each variable Xi and its parents pa(Xi). Modeling of continuous variables is also possible.

30 Independence of Two Events. Marginal independence: P(A, B) = P(A) P(B), equivalently P(A | B) = P(A). Conditional independence: P(A, B | C) = P(A | C) P(B | C), equivalently P(A | B, C) = P(A | C).

31 X = {X1, X2, ..., X10}. pa(X5): the parents of X5; Ch(X5): the children of X5; De(X5): the descendants of X5. Topological sort of X: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10. Chain rule in reverse order: P(X) = P(X10 | X \ {X10}) P(X \ {X10}) = P(X10 | pa(X10)) P(X \ {X10}), because X10 is d-separated from the remaining variables given pa(X10); repeating the argument for X9, X8, ..., X1 yields P(X) = Π_{i=1}^{10} P(Xi | pa(Xi)).

32 A Bayesian Network Represents the Joint Probability Distribution. By the d-separation property, the Bayesian network over n variables X = {X1, X2, ..., Xn} represents P(X) as follows: P(X1, X2, ..., Xn) = Π_{i=1}^{n} P(Xi | pa(Xi)). Given the joint probability distribution, any conditional probability can be calculated in principle.

33 An Illustration of Conditional Independence in BNs. [Figure: a Netica screenshot.]

34 Inference Example. Chain X1 → X2 → X3 with P(X1) = (0.6, 0.4); P(X2 | X1): X1 = 0: (0.2, 0.8), X1 = 1: (0.5, 0.5); P(X3 | X2): X2 = 0: (0.3, 0.7), X2 = 1: (0.7, 0.3).

35 Initial State. P(X2) = Σ_{X1, X3} P(X1, X2, X3) = Σ_{X1} P(X1) P(X2 | X1) = (0.6 x 0.2 + 0.4 x 0.5, 0.6 x 0.8 + 0.4 x 0.5) = (0.32, 0.68); P(X3) follows by marginalizing X2 out of P(X3 | X2) P(X2) in the same way. (CPTs as on the previous slide.)

36 Given that X3 = 1. P(X1 | X3 = 1) = β P(X1, X3 = 1) = β Σ_{X2} P(X1, X2, X3 = 1) = β Σ_{X2} P(X1) P(X2 | X1) P(X3 = 1 | X2) = β P(X1) Σ_{X2} P(X2 | X1) P(X3 = 1 | X2) = β (0.6 x 0.38, 0.4 x 0.5) = β (0.228, 0.200) = (0.53, 0.47), where β is the normalization constant. (CPTs as on slide 34.)
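
Both calculations can be checked with a short brute-force enumeration of the joint P(X1) P(X2 | X1) P(X3 | X2); this is a sketch with our own naming, not code from the tutorial:

    from itertools import product

    p_x1 = [0.6, 0.4]                      # P(X1)
    p_x2_x1 = [[0.2, 0.8], [0.5, 0.5]]     # P(X2 | X1), indexed [x1][x2]
    p_x3_x2 = [[0.3, 0.7], [0.7, 0.3]]     # P(X3 | X2), indexed [x2][x3]

    def joint(x1, x2, x3):
        # P(X1, X2, X3) = P(X1) P(X2 | X1) P(X3 | X2)
        return p_x1[x1] * p_x2_x1[x1][x2] * p_x3_x2[x2][x3]

    # Marginal P(X2): sum out X1 and X3.
    p_x2 = [sum(joint(x1, x2, x3) for x1, x3 in product(range(2), repeat=2))
            for x2 in range(2)]
    print([round(p, 2) for p in p_x2])  # [0.32, 0.68]

    # Posterior P(X1 | X3 = 1): sum out X2, then normalize (the beta above).
    unnorm = [sum(joint(x1, x2, 1) for x2 in range(2)) for x1 in range(2)]
    z = sum(unnorm)
    print([round(u / z, 2) for u in unnorm])  # [0.53, 0.47]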

37 Learning Example. Structure: X1 → X2 → X3. Data: 22 examples, each giving binary values for X1, X2, and X3.

38 Parameter Learning. From the 22 examples: P(X1) = (13/22, 9/22); P(X2 | X1): X1 = 0: (7/13, 6/13), X1 = 1: (3/9, 6/9).
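
The estimates above are just relative frequencies. A minimal counting sketch follows (the data rows here are illustrative; the 22 binary examples on the original slide did not survive extraction):

    from collections import Counter

    # Illustrative rows (x1, x2, x3); not the slide's actual 22 examples.
    data = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0)]

    def mle_cpt(data, child, parent=None):
        # Relative-frequency (MLE) estimate of P(child) or P(child | parent).
        if parent is None:
            counts = Counter(row[child] for row in data)
            return {v: c / len(data) for v, c in counts.items()}
        pair = Counter((row[parent], row[child]) for row in data)
        totals = Counter(row[parent] for row in data)
        return {(pv, cv): c / totals[pv] for (pv, cv), c in pair.items()}

    print(mle_cpt(data, child=0))            # plays the role of P(X1) = (13/22, 9/22)
    print(mle_cpt(data, child=1, parent=0))  # plays the role of P(X2 | X1)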

39 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification, Dependency Analysis / Summary

40 Learning Bayesian Networks. Data acquisition → preprocessing → Bayesian network learning (guided by prior knowledge) → BN structure + local probability distributions. Bayesian network learning involves: * structure search, * score metric, * parameter learning.

41 Learning Bayesian Networks (Cont'd). Bayesian network learning consists of structure learning (the DAG structure) and parameter learning (the local probability distributions). Four situations: known structure and complete data; unknown structure and complete data; known structure and incomplete data; unknown structure and incomplete data.

42 Parameter Learning. Task: given a network structure, estimate the parameters of the model from data. [Figure: a network over A, B, C, D with example CPT entries (e.g., 0.99, 0.07) to be estimated from samples s1, s2, ..., sM by maximizing the likelihood L(Θ; D).]

43 Key Point: Independence of Parameter Estimation. D = {s1, s2, ..., sM}, where s = (a, b, c, d) is an instance of the random vector S = (A, B, C, D). Assumption: the samples s are independent and identically distributed (i.i.d.). Then L(Θ_G; D) = P(D | Θ_G) = Π_{m=1}^{M} P(s_m | Θ_G) = Π_{m=1}^{M} P(a_m) P(b_m | a_m) P(c_m | a_m, b_m) P(d_m | b_m, c_m), so the likelihood factors into one product per node: independent parameter estimation for each node (variable).

44 One can estimate the parameters for P(A), P(B | A), P(C | A, B), and P(D | B, C) in an independent manner, using the data columns (a1, ..., aM), (b1, ..., bM), (c1, ..., cM), (d1, ..., dM). If A, B, C, and D are all binary-valued, the number of parameters is reduced from 2^4 - 1 = 15 for the full joint to 1 + 2 + 4 + 4 = 11, since P(A, B, C, D) = P(A) x P(B | A) x P(C | A, B) x P(D | B, C).

45 Methods for Parameter Estimation. Maximum likelihood estimation: choose the value of Θ which maximizes the likelihood of the observed data D: Θ̂ = argmax_Θ L(Θ; D) = argmax_Θ P(D | Θ). Bayesian estimation: represent the uncertainty about the parameters using a probability distribution over Θ; Θ is a random variable rather than a point value: P(Θ | D) = P(D | Θ) P(Θ) / P(D), i.e., posterior ∝ likelihood x prior.

46 Bayes Rule, MAP and ML. Bayes rule: P(h | D) = P(D | h) P(h) / P(D), where h is a hypothesis (models or parameters) and D is the data. ML (maximum likelihood) estimation: h* = argmax_h P(D | h). MAP (maximum a posteriori) estimation: h* = argmax_h P(h | D). Bayesian learning: keep P(h | D) itself, not a point estimate but the posterior distribution. From the NIPS '99 tutorial by Ghahramani, Z.

47 Bayesian Estimation for the Multinomial Distribution. Prior: θ ~ Dir(α1, α2, ..., αK), P(θ) ∝ Π_k θ_k^(α_k - 1). Likelihood of data D = {s1, s2, ..., sM}: P(D | θ) = Π_k θ_k^(N_k), where the counts N_k are the sufficient statistics. Posterior: P(θ | D) ∝ P(D | θ) P(θ) ∝ Π_k θ_k^(α_k + N_k - 1), again a Dirichlet. Posterior mean: E[θ_k | D] = (α_k + N_k) / Σ_l (α_l + N_l), a smoothed version of the MLE; the α_k encode prior knowledge as pseudo counts.

48 An Example: Coin Toss. Data D: N_H = 5 heads and N_T = 1 tail. Maximum likelihood estimation: P(D | θ) = θ^5 (1 - θ), so θ̂ = argmax_θ P(D | θ) = N_H / (N_H + N_T) = 5/6. Bayesian inference with θ ~ Dir(1, 1): P(θ | D) ∝ θ^5 (1 - θ), and P(heads | D) = E[θ | D] = (N_H + 1) / (N_H + N_T + 2) = 6/8 = 0.75.
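
In code, the two estimates are one line each; a minimal sketch of the Beta/Dirichlet conjugacy described above:

    n_heads, n_tails = 5, 1   # the data on the slide
    a_heads, a_tails = 1, 1   # Dirichlet(1, 1) prior, i.e., uniform over theta

    # Maximum likelihood: the relative frequency.
    theta_mle = n_heads / (n_heads + n_tails)
    print(theta_mle)          # 0.8333...

    # Bayesian: the posterior is Dirichlet(a_heads + n_heads, a_tails + n_tails),
    # and the predictive P(heads | D) is its mean.
    theta_bayes = (a_heads + n_heads) / (a_heads + n_heads + a_tails + n_tails)
    print(theta_bayes)        # 0.75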

49 [Figure/table: estimates of θ under the MLE and under Dirichlet priors (0.5, 0.5), (1, 1), (2, 2), and (5, 5) after short sequences of tosses (T, TT, ...).]

50 Variation of the Posterior Distribution for the Parameter. [Figure: posterior densities under the priors (0.5, 0.5), (1, 1), (2, 2), and (5, 5).]

51 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning (Scoring Metric, Search Strategy) / Practical Applications: DNA Microarray Classification, Dependency Analysis / Summary

52 Structural Learning. Task: given a data set, search for the most plausible network structure underlying the generation of the data set. Metric (score-based) approach: use a scoring metric to measure how well a particular structure fits the observed set of cases. [Figure: samples s1, s2, ..., sM are evaluated against candidate structures over A, B, C, D by a scoring metric, and a search strategy proposes structures to score.]

53 Scoring Metric. P(G | D) = P(G) P(D | G) / P(D) ∝ P(G) P(D | G): a prior over network structures times the marginal likelihood. Likelihood score: Score(G; D) = log P(D | G, Θ_MLE) = M Σ_{i=1}^{n} I(X_i; pa(X_i)) plus a structure-independent term. Nodes with high mutual information (dependency) with their parents get a higher score. Since I(X; Y) ≤ I(X; {Y, Z}), the fully connected network is obtained in the unrestricted case: the likelihood score is prone to overfitting.

54 Likelihood Score in Relation to Information Theory. log L(Θ̂; D) = Σ_{i=1}^{n} Σ_j Σ_k N_ijk log (N_ijk / N_ij) = M Σ_{i=1}^{n} I(X_i; pa(X_i)) - M Σ_{i=1}^{n} H(X_i), where j ranges over parent configurations of X_i and k over the values of X_i. The entropy term does not depend on the structure; the empty network scores -M Σ_i H(X_i), and adding parents improves the score by the mutual information they contribute.

55 Bayesian Score. Considers the uncertainty of the parameter estimates in the Bayesian network: Score(G; D) = P(G | D) ∝ P(G) ∫ P(D | G, Θ) P(Θ | G) dΘ. Assuming complete data and parameter independence, the marginal likelihood can be rewritten as P(D | G) = Π_{i=1}^{n} ∫ P(D_{X_i} | pa(X_i), θ_i) P(θ_i | G) dθ_i: a marginal likelihood for each pair (X_i; pa(X_i)).

56 Bayesian Dirichlet Score. For the multinomial case, if we assume a Dirichlet prior for each parameter, θ_ij ~ Dir(α_ij1, α_ij2, ...) (Heckerman, 1995): P(D | G) = Π_i Π_j [Γ(α_ij) / Γ(α_ij + N_ij)] Π_k [Γ(α_ijk + N_ijk) / Γ(α_ijk)], where N_ijk = # of cases with X_i = x_i^k and pa(X_i) in configuration j, N_ij = Σ_k N_ijk, and α_ij = Σ_k α_ijk. Gamma function: Γ(n + 1) = n Γ(n) = n!, Γ(1) = 1, Γ(x) = ∫_0^∞ t^(x-1) e^(-t) dt.

57 Bayesian Dirichlet Score (Cont'd). Worked coin-toss example: with prior θ ~ Dir(α_H, α_T) and data D containing N_H heads and N_T tails, P(D | G) = [Γ(α_H + α_T) / Γ(α_H + α_T + N_H + N_T)] x [Γ(α_H + N_H) / Γ(α_H)] x [Γ(α_T + N_T) / Γ(α_T)], evaluated term by term using Γ(n + 1) = n Γ(n).

59 Bayesian Dirichlet Score (Cont'd). P(D | G) = Π_{i=1}^{n} ∫ P(D_{X_i} | pa(X_i), θ_i) P(θ_i) dθ_i = Π_{i=1}^{n} Π_j [Γ(α_ij) / Γ(α_ij + N_ij)] Π_k [Γ(α_ijk + N_ijk) / Γ(α_ijk)]. log P(D | G), the Bayesian score, is asymptotically equivalent to BIC (Schwarz, 1978) and to minus the MDL criterion (Rissanen, 1978): BIC(G; D) = log P(D | Θ̂, G) - (dim G / 2) log M, where dim G = # of parameters in G.
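
In practice the Gamma-function products are evaluated in log space. Here is a sketch of a scorer for a single (X_i, pa(X_i)) family under our own naming, following the N_ijk / α_ijk notation above:

    from math import lgamma

    def bd_family_score(counts, alpha):
        # Log marginal likelihood of one family.
        # counts[j][k] = N_ijk: cases with pa(X_i) in configuration j and X_i = k;
        # alpha[j][k] = the matching Dirichlet pseudo-counts.
        score = 0.0
        for n_j, a_j in zip(counts, alpha):
            score += lgamma(sum(a_j)) - lgamma(sum(a_j) + sum(n_j))
            score += sum(lgamma(a + n) - lgamma(a)
                         for a, n in zip(a_j, n_j))
        return score

    # Coin example: one variable, no parents (a single configuration),
    # 5 heads and 1 tail under a Dirichlet(1, 1) prior: log(1/42).
    print(bd_family_score([[5, 1]], [[1, 1]]))  # approximately -3.738

Summing this quantity over all families gives log P(D | G), the decomposability exploited on the next slide.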

60 Structure Search. Given a data set, a score metric, and a set of possible structures, find the network structure with maximal score: a discrete optimization problem. One can exploit the decomposability of the score into independent terms, one per pair (X_i, pa(X_i)): P(D | G) = Π_{i=1}^{n} P(D_{X_i} | pa(X_i), G), so Score(G; D) = log P(D | G) = Σ_{i=1}^{n} Score(X_i; pa(X_i)).

61 Tree-Structured Networks. Definition: each node has at most one parent. An efficient search algorithm exists. Improvement over the empty network: Score(G; D) = Σ_i Score(X_i | pa(X_i)) = Σ_i [Score(X_i | pa(X_i)) - Score(X_i)] + Σ_i Score(X_i), where the last term is the score of the empty network. Chow and Liu (1968): construct the complete undirected graph with the weight of edge E(X_i, X_j) set to I(X_i; X_j); build a maximum weighted spanning tree; transform it into a directed tree with an arbitrary root node.

62 [Figure: from samples s1, s2, ..., sM over A, B, C, D, compute the pairwise mutual informations I(A;B), I(A;C), I(A;D), I(B;C), I(B;D), I(C;D) as weights on the complete undirected graph, extract the maximum spanning tree, and orient its edges away from an arbitrary root to obtain the directed tree; see the sketch below.]
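
A compact sketch of the Chow-Liu procedure (our own implementation, assuming a matrix of discrete samples): empirical mutual information supplies the edge weights, and a Prim-style pass grows the maximum-weight spanning tree, which is then read off as parent-child edges away from the root:

    import numpy as np
    from collections import Counter

    def mutual_info(x, y):
        # Empirical I(X;Y) for two discrete columns.
        n = len(x)
        pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
        return sum((c / n) * np.log(c * n / (px[a] * py[b]))
                   for (a, b), c in pxy.items())

    def chow_liu_tree(data):
        # Return directed edges (parent, child), rooted at variable 0.
        n_vars = data.shape[1]
        w = {(i, j): mutual_info(data[:, i], data[:, j])
             for i in range(n_vars) for j in range(i + 1, n_vars)}
        in_tree, edges = {0}, []
        while len(in_tree) < n_vars:
            i, j = max((e for e in w if (e[0] in in_tree) != (e[1] in in_tree)),
                       key=lambda e: w[e])
            parent, child = (i, j) if i in in_tree else (j, i)
            edges.append((parent, child))
            in_tree.add(child)
        return edges

    data = np.array([[0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]])
    print(chow_liu_tree(data))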

63 Search Strategies for General Bayesian Networks. With more than one parent allowed per node, the problem is NP-hard (Chickering, 1996), so heuristic search methods are usually employed: greedy hill-climbing (local search), greedy hill-climbing with random restarts, simulated annealing, and tabu search.

64 Greedy Local Search Algorithm. INPUT: data set, scoring metric, possible structures. Initialize the structure (empty network, random network, tree-structured network, etc.). Repeatedly perform the local edge operation (INSERT, DELETE, or REVERSE) that yields the largest improvement in score among all possible operations, continuing while Score(G_{t+1}; D) > Score(G_t; D); otherwise stop and return G_final = G_t.
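
A skeleton of that loop follows; this is a sketch, not the tutorial's code: score can be any decomposable metric (such as the BD family score above, summed over families), and acyclicity checking is delegated to a helper the caller supplies:

    def neighbors(graph, n_vars):
        # All graphs one edge operation away: INSERT, DELETE, or REVERSE.
        # graph maps each node to its set of parents.
        for i in range(n_vars):
            for j in range(n_vars):
                if i == j:
                    continue
                g = {v: set(ps) for v, ps in graph.items()}
                if i in g[j]:
                    g[j].discard(i)                      # DELETE i -> j
                    yield g
                    g2 = {v: set(ps) for v, ps in g.items()}
                    g2[i].add(j)                         # REVERSE i -> j
                    yield g2
                else:
                    g[j].add(i)                          # INSERT i -> j
                    yield g

    def greedy_search(data, n_vars, score, is_acyclic):
        graph = {v: set() for v in range(n_vars)}        # empty initial network
        best = score(graph, data)
        while True:
            moves = [g for g in neighbors(graph, n_vars) if is_acyclic(g)]
            candidate = max(moves, key=lambda g: score(g, data))
            if score(candidate, data) <= best:
                return graph                             # local maximum reached
            graph, best = candidate, score(candidate, data)

Because the score decomposes per family, a real implementation would re-score only the one or two families an edge operation touches rather than the whole graph.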

65 Enhanced Search. Greedy local search can get stuck in local maxima or on plateaux. Standard heuristics to escape these include search with random restarts, simulated annealing, tabu search, and genetic algorithms (population-based search). [Figure: score landscapes comparing greedy search, greedy search with random restarts, simulated annealing, and population-based search.]

66 Learning Bayesian Networks: Summary. In a Bayesian network, the joint probability distribution is a product of conditional probabilities, one per variable (node); a sum in log representation. Parameter learning: estimation of the local probability distributions by MLE or Bayesian estimation. Structure learning: score metrics (likelihood, Bayesian score, BIC, MDL) and search strategies (optimization in a discrete space to maximize the score); the key concept is the decomposability of the score; tree-structured and general Bayesian networks.

67 Introduction / Basic Concepts of Bayesian Networks / Learning Bayesian Networks: Parameter Learning, Structural Learning / Applications: DNA Microarray Classification (Naïve Bayes, TAN), Dependency Analysis / Summary

68 DNA Microarray Data Analysis. Classification: gene expression data of 72 leukemia patients; task: classification of samples into AML or ALL based on expression patterns using Bayesian networks. Combined analysis of gene expression data and drug activity data: gene expression and drug activity data of 60 cancer samples; task: construct a dependency network of genes and drugs. Tools: WEKA, a collection of machine learning algorithms for data mining tasks implemented in Java, open-source software issued under the GNU General Public License; Bayesian network learning algorithms are included. The BNJ software is used for visualization when needed.

69 DNA Microarrays. Recent developments in the technology for biological experiments have made it possible to produce massive biological data sets: microarrays monitor thousands of gene expression levels simultaneously (vs. traditional one-gene experiments), giving a parallel view of the expression patterns of thousands of genes in a cell. Bayesian networks are a useful tool for identifying a variety of meaningful relationships among genes from such data.

70 Bayesian Networks in Microarray Data Analysis. From expression profiles, Bayesian network construction supports: sample classification (disease diagnosis); gene-gene relation analysis (activation or inhibition between genes); gene regulatory network analysis (a global view of the relations among genes); and combined analysis with other biological data (DNA sequence data, drug activity data, and so on).

71 DNA Microarray: Image Analysis. [Figure: microarray images.]

72 Data Preparation for Data Mining. [Figure: microarray image samples (Sample 1, ..., Sample M) pass through image analysis to yield a numerical gene-by-sample data matrix for data mining.]

73 Example 1: Tumor Type Classification. Task: classification of leukemia samples into two classes based on gene expression patterns using Bayesian networks. [Figure: DNA microarray data from two kinds of leukemia patients; selected genes (Gene A, Gene B, Gene C, Gene D) and the target variable; the learned Bayesian network over these nodes.]

74 Data Sets. 72 samples in total: 25 AML (acute myeloid leukemia) samples + 47 ALL (acute lymphoblastic leukemia) samples (Golub et al., 1999). A data sample consists of expression measurements for over 7,000 genes. Preprocessing: 30 informative genes were selected by the P-metric score, P(g) = (μ_AML - μ_ALL) / (σ_AML + σ_ALL), and expression values were discretized into 2 levels, High and Low. The final data set is 72 samples, each consisting of High or Low expression values of the 30 genes.

75 Classification by Bayesian Networks. Classification as inference for one variable (node) in a Bayesian network. Bayes optimal classifier: c*(g) = argmax_{c ∈ C} Σ_h P(c | g, h) P(h | D). If P(ĥ | D) ≈ 1, then c*(g) = argmax_{c ∈ C} P(c | g, ĥ), with P(c_k | g, ĥ) ∝ P(c_k, g | ĥ). In the learned network, this joint factors as P(type, g) = P(type) P(pc | type) P(ltc4s | type) P(zyxin | type) P(c_myb | type, pc) P(fah | type, ltc4s) P(mb_1 | type, fah).

76 Naïve Bayes Classifier. A very restricted form of Bayesian network: conditional independence of all variables (nodes) given the value of the class variable. Though simple, it is still widely used in classification tasks: P(type, g) = P(type) P(c_myb | type) P(zyxin | type) P(pc | type) P(ltc4s | type) P(fah | type) P(mb_1 | type).
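
A minimal Naive Bayes sketch over High/Low discretized genes; the gene names follow the slide, but the probability values here are invented for illustration, not the ones learned from the leukemia data:

    p_type = {"AML": 25 / 72, "ALL": 47 / 72}   # class prior from the sample counts
    p_gene = {                                   # P(gene | type); values are made up
        "zyxin": {("High", "AML"): 0.8, ("Low", "AML"): 0.2,
                  ("High", "ALL"): 0.1, ("Low", "ALL"): 0.9},
        "c_myb": {("High", "AML"): 0.3, ("Low", "AML"): 0.7,
                  ("High", "ALL"): 0.85, ("Low", "ALL"): 0.15},
    }

    def classify(sample):
        # Posterior over classes: P(c) * prod_i P(g_i | c), then normalize.
        post = dict(p_type)
        for gene, value in sample.items():
            for c in post:
                post[c] *= p_gene[gene][(value, c)]
        z = sum(post.values())
        return {c: p / z for c, p in post.items()}

    print(classify({"zyxin": "High", "c_myb": "Low"}))  # strongly favors AML here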

77 Tree-Augmented Network (TAN) (Friedman et al., 1997). Naïve Bayes plus a tree-structured network over the variables other than the class node: it expresses the dependency between variables in a restricted way. The class is the root node, and each X_i can have at most one parent besides the class node; e.g., P(zyxin | type) becomes P(zyxin | type, fah).

78 Results (NB, TAN). Leave-one-out cross-validation: Bayesian network learning with 71 samples, test on the remaining one; 72 iterations. Correct classifications out of 72 (6 genes / 30 genes): NB 66/72, 68/72; TAN 69/72, 66/72; general BN 68/72, 67/72; general BN with max #parents = 2: 69/72, 65/72; with max #parents = 3: 70/72, 67/72.

79 Markov Blanket in a BN. The Markov blanket of a node T satisfies P(T | X \ {T}) = P(T | MB(T)), where MB(T) = pa(T) ∪ Ch(T) ∪ (pa(Ch(T)) \ {T}). [Figure: a ten-node network X1, ..., X10 and a TAN, with the Markov blanket of the target node highlighted.]
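
The definition translates directly into code; a sketch, with the network given as a mapping from each node to its parent set:

    def markov_blanket(parents, t):
        # MB(T) = pa(T), Ch(T), and the children's other parents.
        children = {v for v, ps in parents.items() if t in ps}
        co_parents = {p for c in children for p in parents[c]} - {t}
        return parents[t] | children | co_parents

    # Example: X1 -> X2, X2 -> X3, X2 -> X4.
    parents = {1: set(), 2: {1}, 3: {2}, 4: {2}}
    print(markov_blanket(parents, 2))  # {1, 3, 4}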

80 Example 2: Gene-Drug Dependency Analysis. Task: construct a Bayesian network for the combined analysis of gene expression data and drug activity data. Pipeline: gene expression data and drug activity data are preprocessed (thresholding, discretization); genes, drugs, and a cancer-type node are selected; Bayesian network learning produces the learned network, which supports dependency analysis and probabilistic inference.

81 Data Sets. NCI60 data sets (Scherf et al., 2000): gene expression measurements and the activities of 1,400 chemical compounds for 60 human cancer samples. Preprocessing: 12 genes and 4 drugs were selected based on the correlation analysis in Scherf et al. (2000), considering the learning time and the visualization of Bayesian networks; gene expression values and drug activity values were discretized into 3 levels, High, Mid, Low. Nodes in the Bayesian network: 12 genes, 4 drugs, and the cancer type.

82 Results. Visualization with the BNJ 3 software; identification of a plausible dependency among 2 genes and 1 drug.

83 Inferring a Gene Regulation Model. Reconstruction of regulatory relations among genes from genome data. One of the challenges in reconstructing gene regulatory networks using Bayesian networks is statistical robustness, which arises from the sparse nature of gene expression data: usually, the number of samples is much smaller than the number of genes (attributes), which can produce unstable, spurious results. Remedies: the bootstrap, model averaging, and incorporation of prior biological knowledge.

84 Bootstrap-Based Approach. Learn multiple Bayesian networks from bootstrapped samples (Friedman, N. et al., 2000; Pe'er, D. et al., 2001). Expression data: yeast cell-cycle data, 76 samples for 800 genes. Resampling with replacement yields data sets D1, D2, ..., DL and learned networks G1, G2, ..., GL (Friedman et al., 2000). Estimate feature statistics, e.g., "gene Y is in the Markov blanket of gene X": conf(f) = (1/L) Σ_{i=1}^{L} f(G_i). This identifies significant pairwise relations (Friedman, 2000) and subnetworks (Pe'er, 2001).
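
A sketch of the bootstrap confidence estimate; learn_network and feature are placeholders for whichever structure learner and network feature (e.g., a Markov blanket membership test) one plugs in:

    import random

    def bootstrap_confidence(data, feature, learn_network, n_boot=100):
        # conf(f) = (1/L) * sum_i f(G_i) over networks learned from resamples.
        hits = 0
        for _ in range(n_boot):
            resample = [random.choice(data) for _ in data]  # with replacement
            hits += feature(learn_network(resample))        # feature returns 0 or 1
        return hits / n_boot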

85 Model-Averaging-Based Approach. Estimate the confidence of features by averaging over the posterior distribution of models (Hartemink et al., 2002). Expression data plus location data, which serves as a prior; structure learning with simulated annealing over networks G1, G2, ..., GL: conf(f) = Σ_{i=1}^{L} P(G_i | D) f(G_i).

86 Incorporation of Prior Knowledge or Assumptions. Impose biologically motivated assumptions on the network structure. Construction of a gene regulation model from gene expression data by inferring regulator sets (Pe'er, D., et al., 2002). Assumption: a relatively small set of genes is directly involved in transcriptional regulation. Limit the number of candidate regulators, the number of parents for each node (gene), and the number of genes having outgoing edges. This significantly reduces the number of possible models and improves the statistical robustness of the identified regulator sets.

87 [Figure: 3,622 genes in total; a candidate regulator set of 456 genes. Gene expression data: remove noninformative genes, discretize; 358 samples from combined data sets for S. cerevisiae. Learning with structure constraints, scoring Score(g; pa(g)) by the dependence I(g; pa(g)); the result is a 2-level network in which a regulator set R1, R2, ..., RM sits above the target genes (Pe'er et al., 2002).]

88 Summary. Bayesian networks provide an efficient and effective framework for organizing a body of knowledge by encoding the probabilistic relationships among the variables of interest. Graph theory + probability theory: a DAG plus local probability distributions; conditional independence and conditional probability are the keystones. They are a compact way to express complex systems by simpler probabilistic modules, and thus a natural framework for dealing with complexity and uncertainty. Two problems in learning Bayesian networks from data: parameter estimation (MLE, MAP, Bayesian estimation) and structural learning (tree-structured networks, heuristic search for general Bayesian networks).

89 Summary (Cont'd). Not covered in this tutorial but important: probabilistic inference in general Bayesian networks (exact and approximate inference); incomplete data (missing data, hidden variables) with known or unknown structure, where the Expectation-Maximization algorithm can be employed as in latent variable models; Bayesian model averaging over structures as well as parameters; dynamic Bayesian networks for temporal data. Many applications: text and web mining, medical diagnosis, intelligent agent systems, bioinformatics.

90 References: Bayesian networks (papers & books)
Chow, C. and Liu, C., Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory, 14(3), pp. 462-467, 1968.
Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
Neapolitan, R.E., Probabilistic Reasoning in Expert Systems, Wiley, 1990.
Charniak, E., Bayesian networks without tears, AI Magazine, 12(4), pp. 50-63, 1991.
Heckerman, D., Geiger, D., and Chickering, D.M., Learning Bayesian networks: the combination of knowledge and statistical data, Machine Learning, 20, pp. 197-243, 1995.
Jensen, F.V., An Introduction to Bayesian Networks, Springer-Verlag, 1996.
Friedman, N., Geiger, D., and Goldszmidt, M., Bayesian network classifiers, Machine Learning, 29(2), pp. 131-163, 1997.
Mitchell, T.M., Machine Learning, The McGraw-Hill Companies, 1997.
Frey, B.J., Graphical Models for Machine Learning and Digital Communication, MIT Press, 1998.
Jordan, M.I. (ed.), Learning in Graphical Models, Kluwer Academic Publishers, 1998.
Friedman, N. and Goldszmidt, M., Learning Bayesian networks with local structure, in Learning in Graphical Models (ed. Jordan, M.I.), MIT Press, 1998.
Heckerman, D., A tutorial on learning with Bayesian networks, in Learning in Graphical Models (ed. Jordan, M.I.), MIT Press, 1998.
Pearl, J. and Russell, S., Bayesian networks, UCLA Cognitive Systems Laboratory, Technical Report R-277, November 2000.
Spirtes, P., Glymour, C., and Scheines, R., Causation, Prediction, and Search, 2nd edition, MIT Press, 2000.
Jensen, F.V., Bayesian Networks and Decision Graphs, Springer, 2001.
Pearl, J., Bayesian networks, causal inference and knowledge discovery, UCLA Cognitive Systems Laboratory, Technical Report R-281, March 2001.
Korb, K.B. and Nicholson, A.E., Bayesian Artificial Intelligence, CRC Press, 2004.

91 Graphical models and Bayesian networks: tutorials & lectures
Friedman, N. and Koller, D., Learning Bayesian Networks from Data, NIPS 2001 tutorial.
Murphy, K., A Brief Introduction to Graphical Models and Bayesian Networks.
Ghahramani, Z., Probabilistic Models for Unsupervised Learning, NIPS 1999 tutorial.
Ghahramani, Z., Bayesian Methods for Machine Learning.
Moser, Probabilistic Independence Networks I-II, lecture notes.
oet, M., Special Topics: Belief Networks, lecture notes.

92 Bayesian networks with applications to bioinformatics
Murphy, K. and Mian, S., Modelling gene expression data using dynamic Bayesian networks, Technical report, Computer Science Division, University of California, Berkeley, CA, 1999.
Friedman, N., Linial, M., Nachman, I., and Pe'er, D., Using Bayesian networks to analyze expression data, Journal of Computational Biology, 7(3/4), pp. 601-620, 2000.
Cai, D., Delcher, A., Kao, B., and Kasif, S., Modeling splice sites with Bayes networks, Bioinformatics, 16(2), pp. 152-158, 2000.
Pe'er, D., Regev, A., Elidan, G., and Friedman, N., Inferring subnetworks from perturbed expression profiles, Bioinformatics, 17(suppl. 1), pp. S215-S224, 2001.
Pe'er, D., Regev, A., and Tanay, A., Minreg: inferring an active regulator set, Bioinformatics, 18(suppl. 1), 2002.
Hartemink, A.J., Gifford, D.K., Jaakkola, T.S., and Young, R.A., Combining location and expression data for principled discovery of genetic regulatory network models, Pacific Symposium on Biocomputing, 7, pp. 437-449, 2002.
Imoto, S., Kim, S., Goto, T., Aburatani, S., Tashiro, K., Kuhara, S., and Miyano, S., Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network, Journal of Bioinformatics and Computational Biology, 1(2), 2003.
Jansen, R., et al., A Bayesian networks approach for predicting protein-protein interactions from genomic data, Science, 302, pp. 449-453, 2003.
Troyanskaya, O.G., Dolinski, K., Owen, A.B., Altman, R.B., and Botstein, D., A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae), PNAS, 100(14), pp. 8348-8353, 2003.
Beer, M.A. and Tavazoie, S., Predicting gene expression from sequence, Cell, 117(2), pp. 185-198, 2004.
Friedman, N., Inferring cellular networks using probabilistic graphical models, Science, 303(5659), pp. 799-805, 2004.
A comprehensive list is available online.

93 Bayesian Network Software Packages. Bayes Net Toolbox (Kevin Murphy): a variety of algorithms for learning and inference in graphical models, written in MATLAB. WEKA: Bayesian network learning and classification modules are included among a collection of machine learning algorithms written in Java. Detailed lists and comparisons: the Bayesian network software list in the Google directory (.../Belief_networks/software/) and Korb, K.B. and Nicholson, A.E., Bayesian Artificial Intelligence, CRC Press, 2004.


More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

xp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ

xp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ CSE 455/555 Sprng 2013 Homework 7: Parametrc Technques Jason J. Corso Computer Scence and Engneerng SUY at Buffalo jcorso@buffalo.edu Solutons by Yngbo Zhou Ths assgnment does not need to be submtted and

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Nonlinear Classifiers II

Nonlinear Classifiers II Nonlnear Classfers II Nonlnear Classfers: Introducton Classfers Supervsed Classfers Lnear Classfers Perceptron Least Squares Methods Lnear Support Vector Machne Nonlnear Classfers Part I: Mult Layer Neural

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

Statistical learning

Statistical learning Statstcal learnng Model the data generaton process Learn the model parameters Crteron to optmze: Lkelhood of the dataset (maxmzaton) Maxmum Lkelhood (ML) Estmaton: Dataset X Statstcal model p(x;θ) (θ parameters)

More information

International Journal of Mathematical Archive-3(3), 2012, Page: Available online through ISSN

International Journal of Mathematical Archive-3(3), 2012, Page: Available online through   ISSN Internatonal Journal of Mathematcal Archve-3(3), 2012, Page: 1136-1140 Avalable onlne through www.ma.nfo ISSN 2229 5046 ARITHMETIC OPERATIONS OF FOCAL ELEMENTS AND THEIR CORRESPONDING BASIC PROBABILITY

More information