Probabilistic Reasoning


Probabilistic Reasoning (Probabilistisch Redeneren)

Authors: Linda van der Gaag, Silja Renooij

Fall 2013


Preface

In artificial-intelligence research, the probabilistic-network, or (Bayesian) belief-network, framework for automated reasoning with uncertainty is rapidly gaining in popularity. The framework provides a powerful formalism for representing a joint probability distribution on a set of statistical variables. In addition, it offers algorithms for efficient probabilistic inference. At present, more and more knowledge-based systems employing the framework are being developed for various domains of application, ranging from probabilistic information retrieval to medical diagnosis. This syllabus provides a tutorial introduction to the probabilistic-network framework and highlights some issues of ongoing research in applying the framework for problem solving in real-life domains. Each chapter includes a number of exercises; answers or hints to some of them (indicated by a *) are provided at the end.

This syllabus was first written in the late 1990s by L.C. van der Gaag and has been continuously under development ever since. Since 2001, adaptations and extensions have been made mostly by S. Renooij. The syllabus is by no means devoid of imperfections, and any useful comments on its contents are greatly appreciated by the authors.

For the 2006 edition, several references to relevant recent research were added to Chapters 4, 5, and 6. In addition, the material on the subject of sensitivity analysis was extended. For the 2009 edition, some small errors were corrected and relevant references were added. Similar updates were made for the 2013 edition, for which an example in Chapter 6 was also extended.

Linda van der Gaag
Silja Renooij
Utrecht University
July 2013

© All rights reserved. No part of this work may be reproduced without permission of the authors.


Contents

1 Introduction
2 Preliminaries
   Graph Theory
   Probability Theory
3 Independences and Graphical Representations
   The Concept of Independence Revisited
   Pearl's Axiomatic System for Independence
   Properties of Independence Relations
   Graphical Representations of Independence
      Undirected Graphs
      Directed Graphs
   Choosing a Graphical Representation
4 The Probabilistic Network Framework
   The Probabilistic Network Formalism
   Probabilistic Inference
      Directed Trees
      Singly Connected Digraphs
      Multiply Connected Digraphs
      Other Algorithms for Probabilistic Inference
5 Building a Probabilistic Network
   Identifying Variables and Values
   Constructing the Digraph
      Constructing the Digraph by Hand
      Learning the Digraph from Data
   Assessing Probabilities
      Sources of Probabilistic Information
      Simplifying Probability Assessment
      Eliciting Probabilities from Experts
      A Procedure for Probability Refinement
6 Bringing Probabilistic Networks into Practice
   Sensitivity Analysis
      What to Analyse?
      One-way Sensitivity Analysis
      Two-way Sensitivity Analysis
   Evaluating Probabilistic Networks
      The Percentage Correct and its Shortcomings
      The Evaluation Score
   A Problem-Solving Architecture
      Threshold Decision Making
      Selective Evidence Gathering
7 Conclusions
Solutions, Answers and Hints

Chapter 1

Introduction

This chapter gives some historical background on the use of probability theory and other uncertainty formalisms for decision support. It briefly motivates the emergence of probabilistic networks (or: (Bayesian) belief networks) and explains why probabilistic network applications have historically often concerned the medical domain.

Over the past few decades, interest in the results of artificial-intelligence research has grown considerably. Especially the area of knowledge-based systems has attracted much attention. The phrase knowledge-based system, or expert system, is generally employed to denote computer systems in which some symbolic representation of human knowledge is incorporated and applied [Lucas & Van der Gaag, 1991, Jackson, 1990]. Knowledge-based systems are typically designed to deal with real-life problems that require considerable human knowledge and expertise for their solution; examples range from medical diagnosis and technical trouble shooting to financial advice and product design. It is their ability to capture and reason with (specialised) human knowledge that allows knowledge-based systems to arrive at a performance comparable to that of human experts. Knowledge-based systems have by now found their way from academic laboratories to the industrial world and are being integrated into conventional software environments.

As more and more knowledge-based systems are being developed for a large variety of problems, it becomes apparent that the knowledge required to solve these problems often is not precisely defined but instead is of an imprecise nature. In fact, many real-life problem domains are fraught with uncertainty. Human experts in these domains typically are able to form judgements and take decisions based on uncertain, incomplete, and sometimes even contradictory information. To be of practical use, a knowledge-based system has to deal with such information at least equally well.
For this purpose, a knowledge-based system employs a formalism for representing uncertainty and an associated algorithm for manipulating uncertain information. The major research topic in artificial intelligence of reasoning with uncertainty, or plausible reasoning, addresses the design of such formalisms and algorithms [Shafer & Pearl, 1990]. As probability theory is a mathematically well-founded theory about uncertainty, with a long and outstanding tradition of research and experience, it is not surprising that this theory takes a prominent place in research on reasoning with uncertainty in knowledge-based systems. Unfortunately, applying probability theory in a knowledge-based context is not as easy as it may seem at first sight. Straightforward application

of the basic concepts from probability theory leads to insuperable problems of computational complexity: explicit representation of a joint probability distribution requires exponential space (exponential in the number of variables discerned), and even if the distribution could be represented more economically, computing probabilities of interest by the basic rules of marginalisation and conditioning would have an exponential time complexity. The rich history of applying probability theory for reasoning with uncertainty in knowledge-based systems shows various attempts to settle these problems.

In this chapter, we sketch the historical background of applying probability theory in a knowledge-based system. We would like to note that our intention is not to be complete, but merely to give an impression of the problems encountered by researchers pioneering in automated probabilistic inference. In our sketch, we focus on the task of (medical) diagnosis. For a given problem domain, we discern a set of possible hypotheses H = {h_1, ..., h_n}, n ≥ 1, and a set of pieces of evidence E = {e_1, ..., e_m}, m ≥ 1, that may be observed in relation with these hypotheses. For ease of exposition, we assume that each of the hypotheses is either true or false; equally, we assume that each of the pieces of evidence is either true or false. A diagnostic problem in this domain now is a set of pieces of evidence e ⊆ E that is actually observed and needs to be explained in terms of the hypotheses discerned. A diagnosis for a problem e under consideration is a set of hypotheses h ⊆ H that best explains e.

As early as the 1960s, several research efforts on automated reasoning with uncertainty for diagnostic applications were undertaken [Warner et al., 1961, Gorry & Barnett, 1968, De Dombal et al., 1972]. The systems constructed in this period were based to a large extent on application of Bayes' Theorem; in the sequel, we will refer to the approach taken in these early systems as the naive-Bayesian approach.
In this approach, the basic idea of computing a diagnosis for a set of actually observed pieces of evidence e ⊆ E is to compute for all sets of hypotheses h ⊆ H the conditional probability Pr(h | e) from the distribution Pr on the domain at hand, and then select a set h ⊆ H with highest probability. Since for real-life applications the conditional probabilities Pr(e | h) often are easier to come by than the conditional probabilities Pr(h | e), generally Bayes' Theorem is used for computing the required probabilities:

    Pr(h | e) = Pr(e | h) · Pr(h) / Pr(e)

It will be evident that this approach is quite expensive from a computational point of view: because a diagnosis may be composed of several different hypotheses, the number of probabilities to be computed equals 2^n − 1. To overcome this problem of time complexity, a simplifying assumption is made: it is assumed that all hypotheses are mutually exclusive and collectively exhaustive. With this assumption, only the n singleton hypothesis sets {h_i} have to be considered as possible diagnoses. So, only the probabilities Pr(h_i | e) (writing h_i instead of {h_i}) for all h_i ∈ H have to be computed. To this end, once more Bayes' Theorem is used:

    Pr(h_i | e) = Pr(e | h_i) · Pr(h_i) / Pr(e)
                = Pr(e | h_i) · Pr(h_i) / Σ_{k=1..n} Pr(e | h_k) · Pr(h_k)

For automated application of Bayes' Theorem in this form, several probabilities are required from the joint probability distribution Pr on the domain at hand. In fact, conditional probabilities Pr(e | h_k), k = 1, ..., n, for every combination of pieces of

evidence e ⊆ E, have to be available. Apart from the fact that it is hardly likely that these probabilities will be readily available in a real-life problem domain, this means storing exponentially many probabilities. To overcome this problem of space complexity, a second simplifying assumption is made: it is assumed that all pieces of evidence are conditionally independent given any of the hypotheses discerned. The two simplifying assumptions taken together allow for computing the probabilities Pr(h_i | e) for all h_i ∈ H, given observed evidence e = {e_j1, ..., e_jp}, 1 ≤ p ≤ m, from

    Pr(h_i | e_j1 ∧ ... ∧ e_jp)
      = Pr(e_j1 ∧ ... ∧ e_jp | h_i) · Pr(h_i) / Σ_{k=1..n} Pr(e_j1 ∧ ... ∧ e_jp | h_k) · Pr(h_k)
      = Pr(e_j1 | h_i) · ... · Pr(e_jp | h_i) · Pr(h_i) / Σ_{k=1..n} Pr(e_j1 | h_k) · ... · Pr(e_jp | h_k) · Pr(h_k)

It will be evident that for any diagnostic problem e now only n − 1 probabilities have to be computed, and that for this purpose only m · n conditional probabilities and n − 1 prior ones have to be stored.

The systems for automated reasoning with uncertainty constructed in the 1960s were rather small-scale: they were devised for clear-cut problem domains with only a small number of hypotheses and restricted evidence. For these small systems, all probabilities necessary for applying Bayes' Theorem could be acquired from statistical analysis of empirical data. Despite the underlying (over-)simplifying assumptions, these systems performed considerably well [De Dombal et al., 1974]. Nevertheless, interest in this naive Bayesian approach to reasoning with uncertainty faded in the late 1960s and early 1970s. One of the reasons for this decline in interest is that the approach was feasible only for highly restricted problem domains. For larger or more complex domains, the above-mentioned simplifying assumptions often were seriously violated, causing degeneration of system behaviour. In addition, for larger domains the approach inevitably became demanding, either computationally or from an assessment point of view. At this stage, the first diagnostic knowledge-based systems began to emerge from artificial-intelligence research.
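The naive-Bayesian computation described above is compact enough to code directly. The following sketch uses made-up medical numbers purely for illustration (the hypothesis names, probabilities, and function names are ours, not from the early systems discussed):

```python
from math import prod

# Naive-Bayesian diagnosis: hypotheses are assumed mutually exclusive and
# collectively exhaustive, and pieces of evidence conditionally independent
# given each hypothesis. All numbers are invented for illustration only.

priors = {"flu": 0.10, "cold": 0.30, "healthy": 0.60}        # Pr(h_k)
likelihood = {                                               # Pr(e_j | h_k)
    "fever": {"flu": 0.90, "cold": 0.20, "healthy": 0.01},
    "cough": {"flu": 0.70, "cold": 0.80, "healthy": 0.05},
}

def posterior(evidence):
    """Pr(h_i | e_j1 ^ ... ^ e_jp) via Bayes' Theorem with the two assumptions."""
    unnorm = {h: p * prod(likelihood[e][h] for e in evidence)
              for h, p in priors.items()}
    total = sum(unnorm.values())      # denominator: sum over all hypotheses h_k
    return {h: u / total for h, u in unnorm.items()}
```

Note that only m · n conditional probabilities and the priors are stored, exactly as the complexity argument above predicts; the normalising denominator is the sum that appears in the displayed formula.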
These systems mostly use production rules for representing human (experiential) knowledge in a modular form closely resembling logical implications: production rules are expressions of the form if condition then conclusion. These so-called rule-based expert systems exhibit intelligent reasoning behaviour by employing a heuristic reasoning algorithm that uses the production rules for selective gathering of evidence and for pruning the search space of possible diagnoses. It is this pruning behaviour that renders rule-based expert systems capable of dealing with larger and more complex problem domains than the early naive-Bayesian systems. The best-known rule-based expert system developed in the 1970s is the MYCIN system for assisting physicians in the diagnosis and treatment of bacterial infections [Buchanan & Shortliffe, 1984].

In the context of rule-based expert systems, the naive Bayesian approach to reasoning with uncertainty is no longer feasible due to the large number of probabilities to be computed. Since in a rule-based system during problem solving the search space of possible diagnoses is pruned by heuristic as well as probabilistic criteria, it is necessary to compute probabilities for all intermediate results derived by the production rules in

addition to the probabilities of the separate hypotheses. To allow for efficient computation of all these probabilities, a set of computation rules has been designed. These computation rules provide for computing the probability of an (intermediate) result from probabilities associated with the production rules that are used in its derivation; to this end, each production rule is assigned the conditional probability of its conclusion given its condition. Unfortunately, these computation rules do not always accord with the axioms of probability theory and cannot even be considered approximation rules for computing probabilities. In the sequel, we will use the phrase quasi-probabilistic to refer to this approach. The most well-known illustration of the quasi-probabilistic approach is the certainty-factor model, designed originally for dealing with uncertainty in the MYCIN system [Shortliffe & Buchanan, 1984]. The certainty-factor model enjoys widespread use in rule-based expert systems built after MYCIN, even though by now it is widely known that the model is mathematically flawed. The relative success of the model can, however, be accounted for by its satisfactory behaviour in most applications and by its conceptual and computational simplicity [Van der Gaag, 1994].

Although the quasi-probabilistic approach to reasoning with uncertainty in knowledge-based systems met with considerable success in the artificial-intelligence community on the one hand, it was criticised severely on the other hand because of its ad-hoc character. The incorrectness of the approach from a mathematical point of view even led to a world-wide debate concerning the appropriateness of probability theory for handling uncertainty in a knowledge-based context. The adversaries of probability theory argue that the theory is not expressive enough to cope with the different kinds of uncertainty that are encountered in real-life problem domains and therefore have to be dealt with in knowledge-based systems. As a consequence, several other (more or less) mathematical models have been proposed for reasoning with uncertainty.
A major trend in plausible reasoning has arisen from the claim that probability theory is not able to capture imprecision or vagueness, notions of uncertainty which are salient in natural-language representations. The name of L.A. Zadeh is inseparable from this trend: he was the first to propose fuzzy set theory as the point of departure for the development of methods that are able to cope with vague information. Dempster-Shafer theory lies at the basis of another major trend in plausible reasoning. The theory was developed by G. Shafer, building on earlier work by A.P. Dempster [Shafer, 1976]. It was motivated by the observation that probability theory is not able to discern between uncertainty and ignorance due to incompleteness of information. The advocates of probability theory, on the other hand, claim that it is provable that probability theory is the only correct way of dealing with uncertainty and that anything that can be done with non-probabilistic methods can be done equally well using a probability-based method. For this claim, often an argument by R.T. Cox is cited [Cox, 1979]: Cox states a simple set of intuitive properties a measure of uncertainty has to satisfy and subsequently shows that the basic axioms of probability theory follow. Here, we will not enter into the debate concerning the appropriateness of probability theory for reasoning with uncertainty in knowledge-based systems; for a wide range of diverging opinions, the reader is referred to [Cheeseman, 1988] with its ensuing discussions.

Although the above-mentioned debate was not in the least subdued, in the mid-1980s the probabilistic network framework was introduced as a novel approach to applying probability theory for reasoning with uncertainty in knowledge-based systems

[Pearl, 1988]. The probabilistic network framework is characterised by a powerful formalism for representing domain knowledge and the uncertainties that go with it; more specifically, the formalism provides for a concise representation of a joint probability distribution on a set of statistical variables. Associated with this formalism are algorithms for efficiently computing probabilities of interest and for processing evidence; these algorithms constitute the basic building blocks for reasoning with knowledge represented in the formalism. When compared to the naive-Bayesian approach on the one hand and the quasi-probabilistic approach on the other hand, the probabilistic network approach offers advantages over both. In contrast with the quasi-probabilistic approach, the probabilistic network approach has a firm mathematical foundation in probability theory. Contrasting the naive-Bayesian approach, the probabilistic network approach circumvents the need for simplifying assumptions by capturing and reasoning about actual independences among variables. Since its introduction, the probabilistic network framework has rapidly gained in popularity and by now is beginning to illustrate its worth in complex problem domains: practical applications have been and are being developed, for example, for medical diagnosis and prognosis [Andreassen et al., 1987, Heckerman et al., 1992, Blanco et al., 2005], for probabilistic information retrieval [Bruza & Van der Gaag, 1994], in computer vision [Jensen et al., 1990], in forensic science [Taroni et al., 2006], and in various other domains (see [Pourret, Naim & Marcot, 2008]). Whereas earlier applications of probabilistic networks were mostly handcrafted with the help of domain experts, the increasing availability of large data sets has made it much easier to construct applications directly from data [Neapolitan, 2003].
Even large data sets, however, usually do not contain sufficient reliable information to construct reliable networks of general topology; for this reason, network engineers either resort to using various types of classifier [Friedman et al., 1997], or again have to rely on domain expertise to complete the network [Druzdzel & Van der Gaag, 2000].

This syllabus provides a tutorial introduction to the probabilistic network framework and highlights some issues of ongoing research in applying the framework for real-life problem solving. It is organised as follows. Chapter 2 provides some preliminaries from graph theory and from probability theory. In Chapter 3, we discuss the representation of probabilistic independence in graphical models. Chapter 4 introduces the probabilistic network framework: it details the probabilistic network formalism and outlines its associated algorithms. In Chapter 5, we address building probabilistic networks for real-life problem domains. Analysis of and problem solving with probabilistic networks is the topic of Chapter 6. The syllabus is rounded off with some concluding discussion in Chapter 7.


Chapter 2

Preliminaries

This chapter reviews the necessary concepts from graph theory and probability theory that play a central role in the probabilistic network framework. These concepts can be found in any textbook on graph theory or probability theory. The chapter also introduces the notation that is used throughout the syllabus.

2.1 Graph Theory

In this section, some concepts from graph theory are reviewed. Our review is tailored to the probabilistic network framework and is not meant to be exhaustive; for further information, any introductory textbook on graph theory will suffice. Generally, two types of graph are discerned: undirected and directed ones.

Definition. An undirected graph G is a pair G = (V(G), E(G)) where V(G) is a finite set of vertices (also called nodes) and E(G) is a set of unordered pairs (V_i, V_j), V_i, V_j ∈ V(G), called edges. A directed graph, or digraph for short, is a pair G = (V(G), A(G)) where V(G) is a finite set of vertices and A(G) is a set of ordered pairs (V_i, V_j), V_i, V_j ∈ V(G), called arcs.

An arc (V_i, V_j) is often written V_i → V_j or V_j ← V_i. For a vertex in a graph, different sets of related vertices can be identified.

Definition. In a digraph G, vertex V_j is called a predecessor (or parent) of vertex V_i if (V_j, V_i) ∈ A(G); the set of all predecessors of vertex V_i in G is denoted ρ_G(V_i). Likewise, vertex V_j is called a successor (or child) of vertex V_i if (V_i, V_j) ∈ A(G); the set of all successors of vertex V_i in G is denoted σ_G(V_i). The reflexive transitive closure¹ of V_i under the predecessor relation is denoted ρ*_G(V_i); an element from ρ*_G(V_i) is called an ancestor of V_i. Similarly, σ*_G(V_i) denotes the descendants of V_i. The set of neighbours of vertex V_i is defined as

    ν_G(V_i) = σ_G(V_i) ∪ ρ_G(V_i)              if G is directed;
    ν_G(V_i) = { V_j | (V_i, V_j) ∈ E(G) }      if G is undirected.

¹ The reflexive closure of a set A under r is r^0(A) = A, the transitive closure is r^+(A) = r(A) ∪ r^+(r(A)), and both combined give r^*(A) = r^0(A) ∪ r^+(A).

The size of the neighbour set of a vertex is called its degree. In the case of a vertex in a digraph, we in addition define the in-degree to be its number of predecessors and the out-degree to be its number of successors; the incoming and outgoing arcs together are called its incident arcs. In the sequel, we will often drop the subscript G from ρ_G etc. as long as no ambiguity can occur.

The following definition introduces several types of vertex sequence for undirected graphs.

Definition. Let G = (V(G), E(G)) be an undirected graph. A path from vertex V_0 to vertex V_k in G is a sequence of vertices V_0, ..., V_k, k ≥ 0, with distinct edges (V_{i−1}, V_i) ∈ E(G), i = 1, ..., k, between them; k is called the length of the path. A path is called simple if all its vertices are distinct. A cycle is a path V_0, ..., V_k, V_0 from V_0 to V_0 of non-zero length. The graph G is called cyclic if it contains at least one cycle; otherwise, it is called acyclic.

In undirected graphs, self-loops (an edge (V_0, V_0)) are generally not allowed. The concepts of path and cycle introduced for undirected graphs directly apply to directed graphs by considering arcs rather than edges. Unless stated otherwise, we typically assume paths to be simple. We now introduce the concept of underlying graph. This concept associates an undirected graph with a directed one. We thereby assume that directed graphs do not contain self-loops either, although this is not a universal convention.

Definition. Let G = (V(G), A(G)) be a digraph. The underlying graph H of G is the undirected graph H = (V(H), E(H)) where V(H) = V(G) and E(H) is obtained from A(G) by replacing each arc (V_i, V_j) ∈ A(G) by an edge (V_i, V_j).

Related to a digraph's underlying graph, we introduce two additional types of vertex sequence for digraphs.

Definition. Let G be a digraph and let H be its underlying graph. A chain from vertex V_0 to vertex V_k in G is a sequence of vertices V_0, ..., V_k, k ≥ 0, that is a path in the underlying graph H of G; k is called the length of the chain.
A loop in G is a sequence of vertices that is a cycle in the underlying graph H of G.

Note that, in a digraph, the concept of path takes the directions of the arcs into account, while the concept of chain does not. A digraph is therefore acyclic if it contains no directed cycles; an acyclic digraph (or DAG) can contain loops, however. In a directed graph, two vertices may be connected by a chain. If this property holds for any two vertices in a digraph, we say that the graph is connected.

Definition. A digraph G is connected if there exists at least one chain between any two vertices in G; otherwise, it is called unconnected.

We have introduced the concept of connectedness to apply to directed graphs; the concept, however, is easily extended to apply to undirected graphs. We now distinguish between several types of digraph.
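As an aside, the digraph concepts defined above (predecessors, successors, and the reflexive transitive closure yielding the ancestors) can be sketched with a small adjacency representation. The class and method names below are ours, a minimal illustration rather than anything from the syllabus:

```python
# A minimal digraph sketch illustrating the definitions above; the names
# Digraph, parents, children, and ancestors are ours, chosen to mirror the
# notions rho_G, sigma_G, and rho*_G.

class Digraph:
    def __init__(self, vertices, arcs):
        self.vertices = set(vertices)
        self.arcs = set(arcs)                  # ordered pairs (V_i, V_j)

    def parents(self, v):                      # rho_G(v): predecessors of v
        return {i for (i, j) in self.arcs if j == v}

    def children(self, v):                     # sigma_G(v): successors of v
        return {j for (i, j) in self.arcs if i == v}

    def ancestors(self, v):                    # rho*_G(v): reflexive transitive closure
        closure, frontier = {v}, {v}
        while frontier:
            frontier = {p for u in frontier for p in self.parents(u)} - closure
            closure |= frontier
        return closure

# The digraph A -> B -> C with an extra arc A -> C is acyclic (no directed
# cycle), yet it contains the loop A, B, C, A: a cycle in its underlying
# graph. It is therefore multiply connected.
G = Digraph("ABC", {("A", "B"), ("B", "C"), ("A", "C")})
```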

Definition. A digraph G is called singly connected if it does not contain any loops; otherwise, it is called multiply connected. A singly connected digraph G is called a directed tree if each vertex in G has at most one predecessor.

Note that in a singly connected digraph, there is at most one chain between any two vertices; this property does not hold for multiply connected digraphs. To conclude, we introduce the concept of a subgraph. The concept is introduced for undirected graphs, but is extended straightforwardly to apply to digraphs.

Definition. Let G = (V(G), E(G)) be an undirected graph. The subgraph H induced by V' ⊆ V(G) is the undirected graph H = (V', (V' × V') ∩ E(G)).

Note that a subgraph induced by a set of vertices V' takes from the original graph all edges existing among the vertices from V'.

2.2 Probability Theory

In this section, we provide a brief review of some basic concepts from probability theory. Once more, our review is tailored to the probabilistic network framework and is not meant to be an exhaustive tutorial; for further information, any introductory textbook on probability theory will suffice.

Probability theory is often approached from a set-theoretic point of view. Probability distributions are then defined on sets of elements that represent events. All possible outcomes of an experiment (for example, the possible outcomes of rolling a die) are given by the sample space Ω, and each event A is a subset of Ω. A probability measure/function/distribution then is a function from events to the [0, 1] interval. As events are sets, combinations of events can be expressed using operations on sets such as union (∪) and intersection (∩). Outcomes of an experiment are often coded by using a random variable (also: statistical/stochastic variable), which is a function from the sample space to another space (such as the reals). By writing probability distributions on statistical variables, the notation suppresses references to the actual sample space. However, as a statement about a statistical variable defines an event, there is no actual difference.
In the probabilistic-network community, statistical variables are taken to be functions from Ω to Ω. For a variable V defined on outcomes true and false, for example, we therefore have V(true) = true and V(false) = false. We now simply say that V can have, or take on, one of the values true and false, in which case we write V = true or V = false as possible value assignments. The probabilistic-network community approaches probability theory from an algebraic point of view by associating probabilities with logical propositions instead of sets.

In this syllabus, we consider a set of statistical variables V = {V_1, ..., V_n}, n ≥ 1. For ease of exposition, we will often restrict the discussion to binary variables taking one of the truth values true and false; the generalisation to variables with more than two discrete values, however, is rather straightforward. For abbreviation, we will use v to denote the proposition that the variable V takes the value true; V = false will be denoted ¬v. The set of variables V may be looked upon as spanning a Boolean algebra of propositions 𝒱. Informally speaking, this algebra comprises all logical propositions that are built from value assignments to the variables discerned. More formally,

the Boolean algebra of propositions 𝒱 spanned by V is the set of logical propositions consisting of the constant propositions True and False², the atomic propositions v for all V ∈ V, and all compound propositions that are constructed from these by applying the binary operators ∧ (conjunction) and ∨ (disjunction), and the unary operator ¬ (negation); the elements of the algebra 𝒱 adhere to the usual axioms of propositional logic. We now define a joint probability distribution as a function on a Boolean algebra of propositions that is spanned by a set of statistical variables.

Definition. Let V be a set of statistical variables and let 𝒱 be the Boolean algebra of propositions spanned by V. Let Pr : 𝒱 → [0, 1] be a function such that

- Pr(x) ≥ 0 for all x ∈ 𝒱 and, more specifically, Pr(False) = 0;
- Pr(True) = 1;
- Pr(x ∨ y) = Pr(x) + Pr(y) for all x, y ∈ 𝒱 such that x ∧ y ≡ False.

Then Pr is called a joint probability distribution on V. For each x ∈ 𝒱, the function value Pr(x) is termed the probability of x.

A probability Pr(x) for a logical proposition x expresses the amount of certainty concerning the truth of x. Note that in the previous definition we have associated probabilities with logical propositions instead of with sets, which is the more common view taken in (introductory) literature on probability theory. It can easily be shown, however, that the probability of an event (a set of outcomes) is equivalent to the probability of the truth of the proposition asserting the occurrence of the event [Finetti, 1970].

Example. Suppose X and Y are statistical variables, each representing a coin toss. Let A be the event that X = heads and Y = tails; then the probability of this event is

- from a set-theoretic point of view: the probability of event A, written Pr(A);
- from an algebraic point of view: the probability that (X = heads ∧ Y = tails) ≡ True, written Pr(X = heads ∧ Y = tails).

If X = heads and Y = tails were considered two separate events A and B, then this would make no difference from the algebraic point of view, but in the set-theoretic approach we should now write Pr(A ∩ B).
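The definition above can be made concrete by storing one probability per full value assignment and treating a proposition as a predicate that a configuration either satisfies or not; the probability of the proposition is then the total mass of the satisfying configurations. The representation and all numbers below are our own illustrative sketch:

```python
# A joint probability distribution over two binary variables, represented by
# one probability per configuration (full value assignment); the numbers are
# invented for illustration.

variables = ("V1", "V2")
joint = {(True, True): 0.2, (True, False): 0.1,
         (False, True): 0.4, (False, False): 0.3}

def Pr(proposition):
    """Probability of a proposition, given as a predicate on configurations.

    This mirrors the additivity axiom: a proposition's probability is the sum
    of the probabilities of the mutually exclusive configurations making it
    true; Pr(True) sums everything and Pr(False) sums nothing.
    """
    return sum(p for config, p in joint.items()
               if proposition(dict(zip(variables, config))))

# With these numbers one can verify, e.g., the general addition law derived
# from the axioms: Pr(v1 v v2) = Pr(v1) + Pr(v2) - Pr(v1 ^ v2).
```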
In the sequel, we will want to single out strictly positive joint probability distributions, as these have some interesting properties. Strictly positive distributions, for example, are well known for not embedding any functional or logical relationships among their variables.

Definition. Let V be a set of statistical variables and let 𝒱 be the Boolean algebra of propositions spanned by V. Let Pr be a joint probability distribution on V. Pr is strictly positive if Pr(x) = 0 implies x ≡ False.

We now introduce the concept of conditional probability.

² Note the difference between these propositions and the aforementioned outcomes/values!

Definition. Let V be a set of statistical variables and let 𝒱 be the Boolean algebra of propositions spanned by V. Let Pr be a joint probability distribution on V. For each x, y ∈ 𝒱 with Pr(y) > 0, the conditional probability of x given y, denoted Pr(x | y), is defined as

    Pr(x | y) = Pr(x ∧ y) / Pr(y)

The conditional probability Pr(x | y) expresses the amount of certainty concerning the truth of x given that the information y is known with certainty. Note that a conditional probability Pr(x | y) = p does not mean that whenever y is known to be true, the probability of x equals p: it means that the probability of x equals p if y is known to be true and nothing else is known that may affect the certainty concerning the truth of x. In the sequel, we will assume that all conditional probabilities specified are properly defined, that is, for each conditional probability Pr(x | y), we will implicitly assume that Pr(y) > 0. We further state without proof that for a given element y ∈ 𝒱, the conditional probabilities Pr(x | y) for all x ∈ 𝒱 once more constitute a joint probability distribution on V; this probability distribution is called the conditional probability distribution given y and will sometimes be denoted Pr_y. A conditional probability distribution is sometimes referred to as a posterior probability distribution; the joint probability distribution it is obtained from is then, in contrast, referred to as the prior distribution.

The following definition introduces the concept of independence of propositions.

Definition. Let V be a set of statistical variables and let 𝒱 be the Boolean algebra of propositions spanned by V. Let Pr be a joint probability distribution on V. Two propositions x, y ∈ 𝒱 are called (mutually) independent in Pr if

    Pr(x ∧ y) = Pr(x) · Pr(y)

otherwise, x and y are called dependent in Pr. Two propositions x, y ∈ 𝒱 are called conditionally independent given the proposition z ∈ 𝒱 in Pr if

    Pr(x ∧ y | z) = Pr(x | z) · Pr(y | z)

otherwise, x and y are called conditionally dependent given z in Pr.

In the sequel, we will make extensive use of various well-known properties of joint probability distributions.
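As a brief aside, the definitions of conditional probability and independence of propositions just given can be checked numerically on a small distribution. The two-coin distribution and helper names below are ours, chosen so that the two heads-propositions come out independent:

```python
# Conditional probability and independence of propositions, checked on a
# joint distribution over two fair, independent coins X and Y; the numbers
# are ours, chosen for illustration.

joint = {("heads", "heads"): 0.25, ("heads", "tails"): 0.25,
         ("tails", "heads"): 0.25, ("tails", "tails"): 0.25}

def Pr(pred):
    # Probability of a proposition: total mass of configurations (x, y)
    # satisfying the predicate.
    return sum(p for xy, p in joint.items() if pred(*xy))

def Pr_given(pred, cond):
    # Pr(x | y) = Pr(x ^ y) / Pr(y), properly defined only when Pr(y) > 0.
    return Pr(lambda x, y: pred(x, y) and cond(x, y)) / Pr(cond)

x_heads = lambda x, y: x == "heads"
y_heads = lambda x, y: y == "heads"
both = lambda x, y: x_heads(x, y) and y_heads(x, y)

# Independence in Pr: Pr(x ^ y) = Pr(x) * Pr(y).
independent = abs(Pr(both) - Pr(x_heads) * Pr(y_heads)) < 1e-12
```

Changing any one entry of `joint` (while keeping the total at 1) generally destroys the factorisation and makes the two propositions dependent.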
Before stating these properties, we provide some additional concepts and notational conventions. Recall that so far we have built on the Boolean algebra of propositions 𝒱 spanned by some set of statistical variables V. In the sequel, we will want to focus on the variables themselves and to refer to (arbitrary) conjunctions of value assignments to all variables from the set V or from some subset of V. We will use C_W to denote the conjunction C_W = ⋀_{V ∈ W} V of all variables from W ⊆ V; for W = ∅, we take C_W = True. The conjunction of variables C_W is called the configuration template of W. A conjunction c_W of value assignments to the variables from W is called a configuration of W. A configuration c_W is nothing more than a shorthand notation indicating that you are considering a proposition that consists of the conjunction of atomic propositions representing some value assignment to each variable in the set W. Writing c_W, W = {W_1, ..., W_m}, instead of W_1 = some value ∧ W_2 = some value ∧ ... ∧ W_m = some value is very convenient, especially if you do not care about the actual

values and variables. Note that a configuration template C_W is a shorthand notation for an even more general statement, namely about all possible value assignments to the variables in W: any configuration c_W of W of interest can be obtained by filling in appropriate values for all variables involved. To avoid an abundance of braces, we will often write C_V and c_V instead of C_{V} and c_{V}, respectively, for singleton sets {V}. Please note that for a single vertex V we have that C_{V} = V, and that Pr(V) therefore has an entirely different meaning than it would from a set-theoretic point of view!

We now state the various properties that we will use in the sequel. We would like to note that in the literature on probability theory these properties are introduced in many different appearances; we have chosen the form that suits our purposes best. The property stated in the following proposition is known as the chain rule.

Proposition 2.2.6 Let V = {V_1, ..., V_n}, n ≥ 1, be a set of statistical variables and let Pr be a joint probability distribution on V. Then,

    Pr(C_V) = Pr(V_1 ∧ ... ∧ V_n) = Pr(V_n | V_1 ∧ ... ∧ V_{n−1}) · ... · Pr(V_2 | V_1) · Pr(V_1)

In the expression stated in the previous proposition, each V_i is a variable that takes either the value true or the value false, expressed as v_i and ¬v_i, respectively. The expression therefore represents 2^n equalities, one for each configuration of the set of variables V.

The property stated in the following proposition is termed the marginalisation property.

Proposition 2.2.7 Let V be a set of statistical variables and let Pr be a joint probability distribution on V. Then,

    Pr(C_X) = Σ_{c_Y} Pr(C_X ∧ c_Y)

for all sets of variables X, Y ⊆ V.

We state without proof that for any set of variables X ⊆ V, the probabilities Pr(c_X) for all configurations c_X of X once more constitute a joint probability distribution; this probability distribution is termed the marginal probability distribution on X. The conditioning property is stated in the following proposition.

Proposition 2.2.8 Let V be a set of statistical variables and let Pr be a joint probability distribution on V.
Then,

  Pr(C_X) = Σ_{c_Y} Pr(C_X | c_Y) · Pr(c_Y)

for all sets of variables X, Y ⊆ V.

The following theorem is known as Bayes' Theorem.

Theorem 2.2.9 Let V be a set of statistical variables and let Pr be a joint probability distribution on V. Then,

  Pr(C_X | C_Y) = Pr(C_Y | C_X) · Pr(C_X) / Pr(C_Y)

for all sets of variables X, Y ⊆ V.
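The four properties above, the chain rule, marginalisation, conditioning and Bayes' Theorem, can be checked numerically on a concrete distribution. The following Python sketch is our own illustration and not part of the syllabus: the representation of Pr as a dictionary over configurations and the helper names marginal and cond are assumptions chosen for the example.

```python
import itertools

# Joint distribution Pr on three binary variables V = {V1, V2, V3},
# represented as a map from configurations (tuples of 0/1) to probabilities.
weights = [3, 1, 4, 1, 5, 9, 2, 6]
pr = {cfg: w / sum(weights)
      for cfg, w in zip(itertools.product((0, 1), repeat=3), weights)}

def marginal(assignment):
    """Pr(c_X): marginalisation -- sum Pr over all configurations that
    agree with the partial assignment {variable index: value}."""
    return sum(p for cfg, p in pr.items()
               if all(cfg[i] == v for i, v in assignment.items()))

def cond(x, given):
    """Pr(c_X | c_Y) = Pr(c_X and c_Y) / Pr(c_Y)."""
    return marginal({**x, **given}) / marginal(given)

for v1, v2, v3 in itertools.product((0, 1), repeat=3):
    # Chain rule: Pr(v1 ^ v2 ^ v3) = Pr(v3 | v1 ^ v2) . Pr(v2 | v1) . Pr(v1)
    chain = (cond({2: v3}, {0: v1, 1: v2})
             * cond({1: v2}, {0: v1}) * marginal({0: v1}))
    assert abs(pr[(v1, v2, v3)] - chain) < 1e-12

for v1 in (0, 1):
    # Marginalisation: Pr(c_X) = sum over c_Y of Pr(c_X ^ c_Y), X = {V1}
    summed = sum(marginal({0: v1, 1: v2, 2: v3})
                 for v2 in (0, 1) for v3 in (0, 1))
    # Conditioning: Pr(c_X) = sum over c_Y of Pr(c_X | c_Y) . Pr(c_Y)
    conditioned = sum(cond({0: v1}, {1: v2, 2: v3}) * marginal({1: v2, 2: v3})
                      for v2 in (0, 1) for v3 in (0, 1))
    assert abs(summed - marginal({0: v1})) < 1e-12
    assert abs(conditioned - marginal({0: v1})) < 1e-12

for v1, v2 in itertools.product((0, 1), repeat=2):
    # Bayes' Theorem: Pr(c_X | c_Y) = Pr(c_Y | c_X) . Pr(c_X) / Pr(c_Y)
    bayes = cond({1: v2}, {0: v1}) * marginal({0: v1}) / marginal({1: v2})
    assert abs(cond({0: v1}, {1: v2}) - bayes) < 1e-12

print("chain rule, marginalisation, conditioning and Bayes verified")
```

Note that the distribution is strictly positive, so all conditional probabilities in the sketch are well defined.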

To conclude this section, we once more turn to the concept of (conditional) independence. Recall that so far we have taken the concept of independence to apply to propositions. We now introduce the concept of independence of variables.

Definition 2.2.10 Let V be a set of statistical variables and let X, Y, Z ⊆ V. Let Pr be a joint probability distribution on V. Then, the set of variables X is called conditionally independent of the set of variables Y given the set of variables Z in Pr if

  Pr(C_X | C_Y ∧ C_Z) = Pr(C_X | C_Z);

otherwise, X is called conditionally dependent of Y given Z in Pr.

In qualitative terms, the expression Pr(C_X | C_Y ∧ C_Z) = Pr(C_X | C_Z) indicates that, once information about Z is available, information about Y is irrelevant with respect to X. Note that for X and Y to be independent given Z, every pair of configurations of X and Y has to be independent given every configuration of Z. Independence of variables therefore implies independence of propositions. The reverse property, however, does not hold in general. Also note that the expression from Definition 2.2.10 is asymmetric in X and Y. Using Bayes' Theorem, however, it is easily shown that Pr(C_X | C_Y ∧ C_Z) = Pr(C_X | C_Z) implies Pr(C_Y | C_X ∧ C_Z) = Pr(C_Y | C_Z).

Exercises

Exercise 2.1 Prove the following properties for any joint probability distribution, using only definitions and not the properties from this exercise:
a. the chain rule (stated in Proposition 2.2.6);
b. Bayes' Theorem (stated in Theorem 2.2.9);
c. the marginalisation property (stated in Proposition 2.2.7);
* d. the conditioning property (stated in Proposition 2.2.8).

Exercise 2.2 Let V be a set of statistical variables and let Pr be a joint probability distribution on V. Show that

  Pr(C_X ∨ C_Y) = Pr(C_X) + Pr(C_Y) − Pr(C_X ∧ C_Y)

for all sets of variables X, Y ⊆ V.

Exercise 2.3 Let V be a set of statistical variables and let Pr be a joint probability distribution on V. Show that

  Pr(C_X | C_Z) = Σ_{c_Y} Pr(C_X | c_Y ∧ C_Z) · Pr(c_Y | C_Z)

for all sets of variables X, Y, Z ⊆ V.

* Exercise 2.4 Let V be a set of statistical variables and let X, Y, Z ⊆ V. Let Pr be a joint probability distribution on V. Show that the set of variables X is conditionally independent of the set of variables Y given the set of variables Z if and only if

  Pr(C_X ∧ C_Y | C_Z) = Pr(C_X | C_Z) · Pr(C_Y | C_Z).
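The equivalence in Exercise 2.4 can be seen at work in a small numeric example. The Python sketch below is our own illustration, not part of the syllabus; the factorised construction of Pr is an assumption chosen precisely so that the independence holds by design. Here X, Y and Z are singleton sets of binary variables.

```python
import itertools

# Build Pr on binary variables (X, Y, Z) so that, by construction,
# X is conditionally independent of Y given Z:
#   Pr(x ^ y ^ z) = Pr(z) . Pr(x | z) . Pr(y | z)
p_z = [0.4, 0.6]
p_x_z = [[0.2, 0.8], [0.7, 0.3]]   # p_x_z[z][x] = Pr(x | z)
p_y_z = [[0.5, 0.5], [0.1, 0.9]]   # p_y_z[z][y] = Pr(y | z)
pr = {(x, y, z): p_z[z] * p_x_z[z][x] * p_y_z[z][y]
      for x, y, z in itertools.product((0, 1), repeat=3)}

def m(assignment):
    """Marginal probability of a partial assignment {position: value}."""
    return sum(p for cfg, p in pr.items()
               if all(cfg[i] == v for i, v in assignment.items()))

for x, y, z in itertools.product((0, 1), repeat=3):
    # Conditional independence: Pr(c_X | c_Y ^ c_Z) = Pr(c_X | c_Z)
    lhs = m({0: x, 1: y, 2: z}) / m({1: y, 2: z})
    rhs = m({0: x, 2: z}) / m({2: z})
    assert abs(lhs - rhs) < 1e-12
    # Exercise 2.4: Pr(c_X ^ c_Y | c_Z) = Pr(c_X | c_Z) . Pr(c_Y | c_Z)
    joint = m({0: x, 1: y, 2: z}) / m({2: z})
    assert abs(joint - rhs * m({1: y, 2: z}) / m({2: z})) < 1e-12

print("conditional independence of X and Y given Z verified")
```

Both characterisations hold for every configuration, as the definition of independence of variables requires.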

Chapter 3

Independences and Graphical Representations

This chapter formalises two types of independence relation. The first, I_Pr, is the type of relation that can be captured by a probability distribution. These independence relations form a proper subset of a more general type of independence relation I that abstracts away from probability distributions. The chapter also discusses different representations of independence relations, most notably (un)directed graphs. An important notion introduced in this chapter is the concept of d-separation. Current research into independence relations is focussed on defining small generating sets [Waal & Van der Gaag, 2005], and on automated construction of graphical representations from them [Baioletti et al., 2011].

The historical background to the framework of probabilistic networks shows various attempts to handle the computational complexity of applying probability theory for reasoning with uncertainty in knowledge-based systems. The concept of (conditional) independence plays a key role in these attempts, as knowledge about independences allows for simplifying computations. In this chapter, we address formalisms that allow for a concise representation of an independence relation for effective use in a knowledge-based system.

3.1 The Concept of Independence Revisited

In most introductory literature on probability theory, the concept of (conditional) independence is introduced in terms of numerical quantities: the independence relation of a joint probability distribution is taken to be implicitly embedded in the probabilities involved. Recall, for example, that in the previous chapter we have defined two sets of variables X and Y to be conditionally independent given a third set of variables Z if Pr(C_X | C_Y ∧ C_Z) = Pr(C_X | C_Z).
A definition of independence in terms of numbers suggests that, in order to determine whether two sets of variables are (conditionally) independent, several conditional probabilities have to be computed and several equalities have to be tested; moreover, such a definition suggests that for determining independence a joint probability distribution has to be explicitly available for the variables discerned. In contrast, humans tend to be able to state directly, with conviction and consistency, whether or not two sets of variables are independent. Such statements

of independence typically are issued qualitatively, without any reference to numerical manipulation of exact probabilities. Based on these observations, we cannot but conclude that the concept of independence is far more basic to human reasoning than its numerical definition suggests. In fact, the definition of independence in terms of probabilities may be looked upon as a quantitative way of capturing the basic concept, which is qualitative in nature. To formalise properties of the qualitative concept of independence, J. Pearl and his co-researchers have designed an axiomatic system for independence [Pearl & Paz, 1985, Pearl & Verma, 1987, Geiger & Pearl, 1988]. In this section, we review this axiomatic system.

3.1.1 Pearl's Axiomatic System for Independence

We begin our review of Pearl's axiomatic system for independence by introducing some new terminology and notational conventions.

Definition 3.1.1 Let V be a set of statistical variables and let Pr be a joint probability distribution on V. Then, the independence relation I_Pr ⊆ P(V) × P(V) × P(V) of Pr is defined by

  (X, Z, Y) ∈ I_Pr if and only if Pr(C_X | C_Y ∧ C_Z) = Pr(C_X | C_Z)

for all sets of variables X, Y, Z ⊆ V.

In the sequel, we will write I_Pr(X, Z, Y) to denote (X, Z, Y) ∈ I_Pr and ¬I_Pr(X, Z, Y) to denote (X, Z, Y) ∉ I_Pr. A statement I_Pr(X, Z, Y) of a joint probability distribution's independence relation I_Pr is termed an independence statement. In qualitative terms, an independence statement I_Pr(X, Z, Y) expresses that in the context of information about Z, information about Y is irrelevant with respect to X. We note that the above definition allows for stating some trivial but convenient independence statements, such as I_Pr(X, X, Y), which holds iff Pr(C_X | C_Y ∧ C_X) = Pr(C_X | C_X), i.e. 1 = 1; its symmetric version I_Pr(Y, X, X) is also trivially true, since I_Pr(Y, X, X) iff Pr(C_Y | C_X ∧ C_X) = Pr(C_Y | C_X). In designing his axiomatic system for independence, Pearl builds on a set of properties that are satisfied by any joint probability distribution's independence relation; Theorem 3.1.2 reviews these properties.
Theorem 3.1.2 Let V be a set of statistical variables. Let Pr be a joint probability distribution on V and let I_Pr be its independence relation. Then, I_Pr satisfies the properties

  I_Pr(X, Z, Y) ⟹ I_Pr(Y, Z, X);  (symmetry)
  I_Pr(X, Z, Y ∪ W) ⟹ I_Pr(X, Z, Y) ∧ I_Pr(X, Z, W);  (decomposition)
  I_Pr(X, Z, Y ∪ W) ⟹ I_Pr(X, Z ∪ W, Y);  (weak union)
  I_Pr(X, Z, Y) ∧ I_Pr(X, Z ∪ Y, W) ⟹ I_Pr(X, Z, Y ∪ W);  (contraction)

for all mutually disjoint sets of variables X, Y, Z, W ⊆ V. If the distribution Pr is strictly positive, then I_Pr satisfies the additional property

  I_Pr(X, Z ∪ W, Y) ∧ I_Pr(X, Z ∪ Y, W) ⟹ I_Pr(X, Z, Y ∪ W);  (intersection)

for all mutually disjoint sets of variables X, Y, Z, W ⊆ V.
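That any distribution's independence relation satisfies these properties can be confirmed by brute force on a small example. The Python sketch below is our own illustration, not part of the syllabus; the enumeration strategy and the helper name indep are assumptions. It derives I_Pr for a strictly positive distribution on three binary variables and checks symmetry, decomposition, weak union and contraction for all mutually disjoint X, Y, Z, W.

```python
import itertools

# A strictly positive joint distribution on three binary variables.
weights = [3, 1, 4, 1, 5, 9, 2, 6]
pr = {cfg: w / sum(weights)
      for cfg, w in zip(itertools.product((0, 1), repeat=3), weights)}

def marg(assignment):
    """Marginal probability of a partial assignment {position: value}."""
    return sum(p for cfg, p in pr.items()
               if all(cfg[i] == v for i, v in assignment.items()))

def indep(X, Z, Y):
    """I_Pr(X, Z, Y): Pr(C_X | C_Y ^ C_Z) = Pr(C_X | C_Z), checked for
    every configuration of the variables involved."""
    for cfg in itertools.product((0, 1), repeat=3):
        ax = {i: cfg[i] for i in X}
        ay = {i: cfg[i] for i in Y}
        az = {i: cfg[i] for i in Z}
        lhs = marg({**ax, **ay, **az}) / marg({**ay, **az})
        rhs = marg({**ax, **az}) / marg(az)
        if abs(lhs - rhs) > 1e-9:
            return False
    return True

V = (0, 1, 2)
subsets = [frozenset(s) for n in range(4)
           for s in itertools.combinations(V, n)]
for X, Y, Z, W in itertools.product(subsets, repeat=4):
    if (X & Y) or (X & Z) or (X & W) or (Y & Z) or (Y & W) or (Z & W):
        continue  # the theorem is stated for mutually disjoint sets
    if indep(X, Z, Y):                          # symmetry
        assert indep(Y, Z, X)
    if indep(X, Z, Y | W):                      # decomposition and weak union
        assert indep(X, Z, Y) and indep(X, Z, W)
        assert indep(X, Z | W, Y)
    if indep(X, Z, Y) and indep(X, Z | Y, W):   # contraction
        assert indep(X, Z, Y | W)

print("symmetry, decomposition, weak union and contraction all hold")
```

Since the distribution is strictly positive, every conditional probability in the check is well defined; for distributions with zeroes the checker would need a guard against vanishing marginals.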

The properties stated in the previous theorem are easily verified from the basic axioms of probability theory. We would like to note that we have closely followed Pearl by stating the properties in the theorem to hold for mutually disjoint sets of variables only [Pearl, 1988]. These properties, however, also hold for overlapping sets of variables [Van der Gaag & Meyer, 1998]. Pearl now takes the properties stated in Theorem 3.1.2 as axioms for the qualitative concept of independence [Pearl, 1988]. Following the properties for I_Pr, Pearl assumed for each axiom that the sets of variables involved are mutually disjoint. Given the insight that the properties also hold for overlapping sets, we will lift the assumption of mutual disjointness in the next and all following definitions involving independence relations. The following now defines informational independence:

Definition 3.1.3 Let V be a set of statistical variables. A semi-graphoid independence relation on V is a ternary relation I ⊆ P(V) × P(V) × P(V) such that I satisfies the properties

  I(X, Z, Y) ⟹ I(Y, Z, X);
  I(X, Z, Y ∪ W) ⟹ I(X, Z, Y) ∧ I(X, Z, W);
  I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ W, Y);
  I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W);

for all sets of variables X, Y, Z, W ⊆ V. A graphoid independence relation I on V is a semi-graphoid independence relation on V such that I satisfies the additional property

  I(X, Z ∪ W, Y) ∧ I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W);

for all sets of variables X, Y, Z, W ⊆ V.

The properties described in the previous definition together convey the idea that learning irrelevant information does not alter the independences among the variables discerned [Pearl, 1988]. We consider the qualitative meanings of the various properties separately. The property

  I(X, Z, Y) ⟹ I(Y, Z, X)

for all sets of variables X, Y, Z ⊆ V, states that if information about Y is deemed irrelevant with respect to X in the context of some information about Z, then information about X must be irrelevant with respect to Y in this context; this property is called the symmetry axiom.
The property

  I(X, Z, Y ∪ W) ⟹ I(X, Z, Y) ∧ I(X, Z, W)

for all sets of variables X, Y, Z, W ⊆ V, asserts that if information about both Y and W is judged irrelevant with respect to X, then both information about Y and information about W must be irrelevant with respect to X separately; this property is known as the decomposition axiom. We would like to note that the decomposition axiom may be reformulated as I(X, Z, Y ∪ W) ⟹ I(X, Z, Y); we have chosen, however, to use Pearl's original formulation because it conveys the idea of decomposition more clearly.

The property

  I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ W, Y)

for all sets of variables X, Y, Z, W ⊆ V, states that learning information about W that is known to be irrelevant with respect to X cannot help irrelevant information about Y to become relevant with respect to X; this property is known as the weak union axiom. The property

  I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W)

for all sets of variables X, Y, Z, W ⊆ V, states that if we judge information about W to be irrelevant with respect to X after learning some irrelevant information about Y, then the information about W must have been irrelevant with respect to X before we learned Y; this property is known as the contraction axiom. Note that the contraction axiom can be reformulated as I(X, Z, Y) ⟹ (I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W)). From this reformulation, it is seen that the axiom can be looked upon as a conditional reverse of the weak union axiom. We now consider the property

  I(X, Z ∪ W, Y) ∧ I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W)

for all sets of variables X, Y, Z, W ⊆ V, for graphoid independence relations. This property states that if, in the context of some information about Z, learning information about W renders information about Y irrelevant with respect to X, and learning Y renders W irrelevant with respect to X, then the information about both Y and W must be irrelevant with respect to X given Z; this property is known as the intersection axiom.

From Definition 3.1.3 and Theorem 3.1.2, we observe that every independence relation that is embedded in a joint probability distribution is a semi-graphoid independence relation; this property is stated more formally in the following corollary.

Corollary 3.1.4 Let V be a set of statistical variables. Let Pr be a joint probability distribution on V and let I_Pr be its independence relation. Then, I_Pr is a semi-graphoid independence relation. Furthermore, if Pr is strictly positive, then I_Pr is a graphoid independence relation.

Unfortunately, although any probability distribution's independence relation is a semi-graphoid independence relation, the reverse property does not hold.
There exist semi-graphoid independence relations for which there do not exist joint probability distributions embedding them; for details, we refer to [Van der Gaag & Meyer, 1996, Studený, 1989]. We would like to note that it has been shown that a finite axiomatisation of the concept of probabilistic independence does not exist [Studený, 1992].

3.1.2 Properties of Independence Relations

Using the definition of informational independence, we derive some convenient properties of (semi-graphoid and graphoid) independence relations. The following lemma shows that the symmetry and contraction axioms are easily generalised to bi-implications.

Lemma 3.1.5 Let V be a set of statistical variables. Furthermore, let I be a semi-graphoid independence relation on V. Then,

  I(X, Z, Y) ⟺ I(Y, Z, X);
  I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⟺ I(X, Z, Y ∪ W);

for all sets of variables X, Y, Z, W ⊆ V.

Proof. We begin our proof by observing that, since I is a semi-graphoid independence relation, it obeys the first four axioms stated in Definition 3.1.3. The first property stated in the lemma now follows directly from the symmetry axiom. For the second property, we observe that I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⟹ I(X, Z, Y ∪ W) coincides with the contraction axiom and therefore trivially holds for the relation I. We will now prove that I(X, Z, Y ∪ W) ⟹ I(X, Z, Y) ∧ I(X, Z ∪ Y, W). We have

  I(X, Z, Y ∪ W) ⟹ I(X, Z, Y) ∧ I(X, Z, W) ⟹ I(X, Z, Y)

by the decomposition axiom. In addition, we have

  I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ Y, W)

by weak union. The property stated in the lemma now follows directly.

For graphoid independence relations, we have that the intersection axiom can also be generalised to a bi-implication.

Lemma 3.1.6 Let V be a set of statistical variables. Furthermore, let I be a graphoid independence relation on V. Then,

  I(X, Z ∪ W, Y) ∧ I(X, Z ∪ Y, W) ⟺ I(X, Z, Y ∪ W)

for all sets of variables X, Y, Z, W ⊆ V.

Proof. We will only prove that I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ W, Y) ∧ I(X, Z ∪ Y, W); the reverse property coincides with the intersection axiom and therefore trivially holds for the independence relation I. We have

  I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ W, Y)

and

  I(X, Z, Y ∪ W) ⟹ I(X, Z ∪ Y, W)

by the weak union axiom. The property stated in the lemma now follows directly.

In the sequel, we will use the phrase independence relation to denote a semi-graphoid independence relation, unless stated otherwise.

3.2 Graphical Representations of Independence

One of the problems in applying probability theory for automated reasoning with uncertainty in a knowledge-based system is the space complexity of representing a joint probability distribution. Since the concept of independence plays a key role in solving this problem, a formalism for representing joint probability distributions should allow for efficiently modelling independences. There are various ways of representing an independence relation. One way, for example, is to enumerate the separate statements of an independence relation explicitly. Such a representation clearly is impractical, as the number of tuples in an independence relation can be astronomical. Another way is to make use of the axioms from Definition 3.1.3: only the statements from an appropriate subset of the independence relation are enumerated explicitly, and all its other statements are defined implicitly by this set and the defining axioms. Although exploiting the axioms allows for a far more economical representation of an independence relation than explicit enumeration, it can still require exponential space. In this section, we consider more concise representations of independence relations, building on the idea of graphical encoding. In Section 3.2.1 we address modelling an independence relation in an undirected graph; in Section 3.2.2 we consider the representation of an independence relation in the formalism of directed graphs.

3.2.1 Undirected Graphs

Undirected graphs have no probabilistic meaning by themselves. For representing an independence relation in an undirected graph, therefore, a probabilistic meaning has to be assigned to the topological properties of such a graph; that is, we have to assign a probabilistic meaning to the vertices of the graph and to its edges. Informally speaking, we choose to encode an independence relation in an undirected graph by modelling the variables of the relation as vertices and by representing its independence statements by absence of edges.
To formally capture this meaning, we begin by defining a graphical criterion for reading from a graph sets of vertices that allow for blocking all paths between two given sets of vertices; this graphical criterion is termed the separation criterion for undirected graphs.

Definition 3.2.1 Let G = (V(G), E(G)) be an undirected graph. Let X, Y, Z ⊆ V(G) be sets of vertices in G. The set of vertices Z is said to separate the sets of vertices X and Y in G, denoted as ⟨X | Z | Y⟩_G, if for each vertex V_i ∈ X and each vertex V_j ∈ Y, every simple path from V_i to V_j in G contains at least one vertex from Z.

We look upon a separating set of variables as effectively blocking influence: if a set of variables Z separates two sets of variables X and Y, then Z is looked upon as blocking any flow of information or influence between X and Y. Two sets of variables X and Y that are thus separated by a set Z are now taken to be conditionally independent given Z.

Definition 3.2.2 Let V be a set of statistical variables and let I be an independence relation on V. Furthermore, let G = (V(G), E(G)) be an undirected graph with V(G) = V. Then, the graph G is called an undirected dependence map, or D-map for short, for I, if for all sets of variables X, Y, Z ⊆ V, we have: if I(X, Z, Y) then ⟨X | Z | Y⟩_G;
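The separation criterion is straightforward to operationalise: Z separates X and Y exactly when no vertex of Y is reachable from X once the vertices of Z are removed from the graph. The Python sketch below is our own illustration, not part of the syllabus; the adjacency-list representation and the function name separates are assumptions.

```python
from collections import deque

def separates(adj, X, Z, Y):
    """True iff Z separates X and Y in the undirected graph adj:
    every simple path from a vertex in X to a vertex in Y contains
    at least one vertex from Z."""
    # A vertex shared by X (or Y) and Z lies on every path from itself,
    # so such vertices may be dropped from X and Y.
    X, Z, Y = set(X) - set(Z), set(Z), set(Y) - set(Z)
    seen, queue = set(X), deque(X)
    while queue:                      # BFS over the graph with Z removed
        v = queue.popleft()
        if v in Y:
            return False              # found a path that avoids Z
        for w in adj.get(v, ()):
            if w not in Z and w not in seen:
                seen.add(w)
                queue.append(w)
    return True

# Example: a four-cycle A - B - C - D - A
adj = {'A': ['B', 'D'], 'B': ['A', 'C'],
       'C': ['B', 'D'], 'D': ['A', 'C']}
assert separates(adj, {'A'}, {'B', 'D'}, {'C'})  # both paths blocked
assert not separates(adj, {'A'}, {'B'}, {'C'})   # the path via D remains
print("separation checks passed")
```

Note that reachability suffices here: if any path from X to Y avoids Z, then in particular some simple path does.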


More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

2.3 Nilpotent endomorphisms

2.3 Nilpotent endomorphisms s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

CHAPTER-5 INFORMATION MEASURE OF FUZZY MATRIX AND FUZZY BINARY RELATION

CHAPTER-5 INFORMATION MEASURE OF FUZZY MATRIX AND FUZZY BINARY RELATION CAPTER- INFORMATION MEASURE OF FUZZY MATRI AN FUZZY BINARY RELATION Introducton The basc concept of the fuzz matr theor s ver smple and can be appled to socal and natural stuatons A branch of fuzz matr

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Randomness and Computation

Randomness and Computation Randomness and Computaton or, Randomzed Algorthms Mary Cryan School of Informatcs Unversty of Ednburgh RC 208/9) Lecture 0 slde Balls n Bns m balls, n bns, and balls thrown unformly at random nto bns usually

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Evaluation for sets of classes

Evaluation for sets of classes Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem. prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove

More information

EPR Paradox and the Physical Meaning of an Experiment in Quantum Mechanics. Vesselin C. Noninski

EPR Paradox and the Physical Meaning of an Experiment in Quantum Mechanics. Vesselin C. Noninski EPR Paradox and the Physcal Meanng of an Experment n Quantum Mechancs Vesseln C Nonnsk vesselnnonnsk@verzonnet Abstract It s shown that there s one purely determnstc outcome when measurement s made on

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

MODELING TRAFFIC LIGHTS IN INTERSECTION USING PETRI NETS

MODELING TRAFFIC LIGHTS IN INTERSECTION USING PETRI NETS The 3 rd Internatonal Conference on Mathematcs and Statstcs (ICoMS-3) Insttut Pertanan Bogor, Indonesa, 5-6 August 28 MODELING TRAFFIC LIGHTS IN INTERSECTION USING PETRI NETS 1 Deky Adzkya and 2 Subono

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Note on EM-training of IBM-model 1

Note on EM-training of IBM-model 1 Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are

More information

A new construction of 3-separable matrices via an improved decoding of Macula s construction

A new construction of 3-separable matrices via an improved decoding of Macula s construction Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Volume 18 Figure 1. Notation 1. Notation 2. Observation 1. Remark 1. Remark 2. Remark 3. Remark 4. Remark 5. Remark 6. Theorem A [2]. Theorem B [2].

Volume 18 Figure 1. Notation 1. Notation 2. Observation 1. Remark 1. Remark 2. Remark 3. Remark 4. Remark 5. Remark 6. Theorem A [2]. Theorem B [2]. Bulletn of Mathematcal Scences and Applcatons Submtted: 016-04-07 ISSN: 78-9634, Vol. 18, pp 1-10 Revsed: 016-09-08 do:10.1805/www.scpress.com/bmsa.18.1 Accepted: 016-10-13 017 ScPress Ltd., Swtzerland

More information

Thermodynamics and statistical mechanics in materials modelling II

Thermodynamics and statistical mechanics in materials modelling II Course MP3 Lecture 8/11/006 (JAE) Course MP3 Lecture 8/11/006 Thermodynamcs and statstcal mechancs n materals modellng II A bref résumé of the physcal concepts used n materals modellng Dr James Ellott.1

More information

Speech and Language Processing

Speech and Language Processing Speech and Language rocessng Lecture 3 ayesan network and ayesan nference Informaton and ommuncatons Engneerng ourse Takahro Shnozak 08//5 Lecture lan (Shnozak s part) I gves the frst 6 lectures about

More information

Learning from Data 1 Naive Bayes

Learning from Data 1 Naive Bayes Learnng from Data 1 Nave Bayes Davd Barber dbarber@anc.ed.ac.uk course page : http://anc.ed.ac.uk/ dbarber/lfd1/lfd1.html c Davd Barber 2001, 2002 1 Learnng from Data 1 : c Davd Barber 2001,2002 2 1 Why

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Engineering Risk Benefit Analysis

Engineering Risk Benefit Analysis Engneerng Rsk Beneft Analyss.55, 2.943, 3.577, 6.938, 0.86, 3.62, 6.862, 22.82, ESD.72, ESD.72 RPRA 2. Elements of Probablty Theory George E. Apostolaks Massachusetts Insttute of Technology Sprng 2007

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

a b a In case b 0, a being divisible by b is the same as to say that

a b a In case b 0, a being divisible by b is the same as to say that Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

} Often, when learning, we deal with uncertainty:

} Often, when learning, we deal with uncertainty: Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Formalisms For Fusion Belief in Design

Formalisms For Fusion Belief in Design XII ADM Internatonal Conference - Grand Hotel - Rmn Italy - Sept. 5 th -7 th, 200 Formalsms For Fuson Belef n Desgn Mchele Pappalardo DIMEC-Department of Mechancal Engneerng Unversty of Salerno, Italy

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information