Operations for Inference in Continuous Bayesian Networks with Linear Deterministic Variables


Appeared in: International Journal of Approximate Reasoning, Vol. 42, Nos. 1--2, 2006.

Operations for Inference in Continuous Bayesian Networks with Linear Deterministic Variables

Barry R. Cobb (a), Prakash P. Shenoy (b)

(a) Department of Economics and Business, Virginia Military Institute, Lexington, Virginia 24450, U.S.A.
(b) School of Business, University of Kansas, 1300 Sunnyside Ave., Summerfield Hall, Lawrence, Kansas 66045-7585, U.S.A.

Abstract

An important class of continuous Bayesian networks are those that have linear conditionally deterministic variables (a variable that is a linear deterministic function of its parents). In this case, the joint density function for the variables in the network does not exist. Conditional linear Gaussian (CLG) distributions can handle such cases when all variables are normally distributed. In this paper, we develop the operations required for performing inference with linear conditionally deterministic variables in continuous Bayesian networks using relationships derived from joint cumulative distribution functions (CDFs). These methods allow inference in networks with linear deterministic variables and non-Gaussian distributions.

Key words: Bayesian networks, conditional linear Gaussian models, deterministic variables

1 Introduction

Bayesian networks model knowledge about propositions in uncertain domains using graphical and numerical representations. At the qualitative level, a Bayesian network is a directed acyclic graph where nodes represent variables and the (missing) edges represent conditional independence relations among the variables.

Corresponding author. E-mail addresses: cobbbr@vmi.edu (Barry R. Cobb), pshenoy@ku.edu (Prakash P. Shenoy).

Preprint submitted to Elsevier Science, 3 November 2005.

At the numerical level, a Bayesian network consists of a factorization of a joint probability distribution into a set of conditional distributions, one for each variable in the network.

Continuous Bayesian networks contain variables whose state spaces are uncountable. A commonly used type of Bayesian network which accommodates continuous variables is the conditional linear Gaussian (CLG) model [5,7]. In CLG models, the distribution of a continuous variable is a linear Gaussian function of its continuous parents. The scheme originally developed by Lauritzen [7] allowed exact computation of means and variances in CLG networks when the conditional distribution of a variable given its continuous parents has a positive variance; however, this algorithm did not always compute the exact marginal densities of continuous variables. A new computational scheme for CLG models was developed by Lauritzen and Jensen [8]. To find full local marginals, this scheme places some restrictions on the construction and initialization of junction trees.

An important class of continuous Bayesian networks are those that have linear conditionally deterministic variables (a variable that is a deterministic function of its parents). In this case, the joint density function for the variables in the network does not exist. CLG models can handle such cases when all variables are normally distributed. However, for models where continuous variables are not normally distributed, methods for carrying out exact inference in networks with linear deterministic relationships have not been developed.

Exact inference in hybrid Bayesian networks can be performed using mixtures of truncated exponentials (MTE) potentials [9,12]. General formulations of MTE potentials which approximate the normal probability density function (PDF) exist [1]; however, these formulations cannot be used to model a conditional distribution where the variance of a variable given values of its continuous parents is zero. In this paper, we develop inference operations for linear conditionally deterministic variables using relationships derived from joint cumulative distribution functions (CDFs). These operations allow MTE potentials to be used for inference in any continuous CLG model, as well as other models that have linear conditionally deterministic variables which are non-Gaussian.

The remainder of this paper is organized as follows. Section 2 introduces notation and definitions used throughout the paper. Section 3 introduces techniques for using CDFs to construct PDFs for deterministic variables. Section 4 introduces join tree operations for linear deterministic variables. Section 5 contains an example of inference in a continuous Bayesian network containing linear deterministic variables. Section 6 summarizes and states directions for future research.

Fig. 1. Graphical representation of the conditionally deterministic relationship of Y given X determined by the CMF p_{Y|x}.

2 Notation and Definitions

This section contains notation and definitions that will be used throughout the remainder of the paper.

2.1 Notation

Random variables in a Bayesian network will be denoted by capital letters, e.g., A, B, C. Sets of variables will be denoted by boldface capital letters, e.g., X. All variables in this paper are assumed to take values in uncountable (continuous) state spaces. If X is a set of variables, x is a configuration of specific states of those variables. The continuous state space of X is denoted by Ω_X. MTE probability potentials are denoted by lower-case Greek letters, e.g., α, β, γ.

In graphical representations, continuous nodes in Bayesian networks are represented by double-border ovals. Variables that are deterministic functions of their parents are represented by triple-border ovals. Shaded nodes are degenerate, indicating that evidence has restricted the variable to one value.

2.2 Conditional Mass Function (CMF)

When relationships between continuous variables are deterministic, the joint PDF does not exist. If Y is a deterministic function of the variables in X, i.e., Y = g(X), the conditional mass function (CMF) for {Y | x} is defined as

    p_{Y|x} = 1{y = g(x)},    (1)

where 1{A} is the indicator function of the event A, i.e., 1{A}(B) = 1 if B = A and 0 otherwise. Graphically, the conditionally deterministic relationship of Y given X is represented in a Bayesian network model as shown in Figure 1, where X consists of a single continuous variable X.

2.3 Mixtures of Truncated Exponentials

A mixture of truncated exponentials (MTE) potential [9,12] has the following definition.

MTE potential. Let X = (X_1, ..., X_n) be an n-dimensional random variable. A function φ: Ω_X → R^+ is an MTE potential if one of the next two conditions holds:

(1) The potential φ can be written as

    φ(x) = a_0 + \sum_{i=1}^{m} a_i \exp\{ \sum_{j=1}^{n} b_i^{(j)} x_j \}    (2)

for all x ∈ Ω_X, where a_i, i = 0, ..., m, and b_i^{(j)}, i = 1, ..., m, j = 1, ..., n, are real numbers.

(2) The domain of the variables, Ω_X, is partitioned into hypercubes {Ω_X^1, ..., Ω_X^k} such that φ is defined as

    φ(x) = φ_i(x)  if x ∈ Ω_X^i, i = 1, ..., k,    (3)

where each φ_i, i = 1, ..., k, can be written in the form of equation (2).

In the definition above, k is the number of pieces and m is the number of exponential terms in each piece of the MTE potential. In this paper, all MTE potentials are equal to zero in unspecified regions.

Moral et al. [10] propose an iterative algorithm based on least squares approximation to estimate MTE potentials from data. Moral et al. [11] describe a method to approximate conditional MTE potentials using a mixed tree structure. Cobb et al. [4] describe a nonlinear optimization procedure used to fit MTE parameters for approximations to standard PDFs, including the uniform, exponential, gamma, beta, and lognormal distributions.

Inference in continuous Bayesian networks where all conditional probability distributions are approximated by MTE potentials is performed using the operations of restriction, combination, and marginalization, as defined by Moral et al. [9] and further described by Cobb and Shenoy [1]. Restriction involves substituting real numbers representing evidence into a potential. Combination of two MTE potentials is pointwise multiplication; if two MTE potentials for X are denoted by φ and ψ, the combination of these two potentials is denoted by φ ⊗ ψ. Marginalization of a variable from an MTE potential is closed-form integration; if an MTE potential for X is denoted by φ, the marginalization of a variable X ∈ X from φ is denoted by φ^{−X}. Operations used to marginalize a variable from the combination of an MTE potential and a CMF are defined later in the paper.
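The piecewise-exponential form in equations (2) and (3) is straightforward to evaluate numerically. The following sketch (not from the paper; the pieces, split points, and coefficients are illustrative assumptions only) shows one way to represent and evaluate a one-dimensional, two-piece MTE potential.

```python
import numpy as np

# A one-dimensional MTE potential: a list of (lower, upper, a0, [(a_i, b_i), ...]) pieces.
# Each piece evaluates to a0 + sum_i a_i * exp(b_i * x) on [lower, upper); zero elsewhere.
# The coefficients below are illustrative, not those of any potential in the paper.
PIECES = [
    (0.0, 0.5, 1.2, [(-0.3, 1.0), (0.1, -2.0)]),
    (0.5, 1.0, 0.7, [(0.4, 0.5)]),
]

def mte_eval(x, pieces=PIECES):
    """Evaluate a piecewise MTE potential at scalar x (zero outside all pieces)."""
    for lo, hi, a0, terms in pieces:
        if lo <= x < hi:
            return a0 + sum(a * np.exp(b * x) for a, b in terms)
    return 0.0

# Marginalization of an MTE potential is closed-form integration; here we simply
# check numerically that the potential integrates to a finite constant on [0, 1].
xs = np.linspace(0.0, 1.0, 2001)
print("integral over [0,1]:", np.trapz([mte_eval(x) for x in xs], xs))
```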

3 Using CDFs to Construct PDFs for Deterministic Variables

This section contains standard results from probability theory which describe methods of constructing CDFs and their corresponding PDFs for variables that are deterministic functions of their parents. These results are re-stated here for completeness.

3.1 Monotonically Increasing Functions

Consider a random variable Y which is a monotonically increasing deterministic function of a random variable X. A Bayesian network representing this relationship is shown in Figure 1. The joint CDF for {X, Y} represents the following probability:

    F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y] = P[X ≤ x] · P[Y ≤ y | X ≤ x] = F_X(x) · P[Y ≤ y | X ≤ x]

for any x ∈ Ω_X such that F_X(x) > 0. When Y is a monotonically increasing function of X, X = g^{-1}(Y) and the event {Y ≤ y} coincides with the event {X ≤ g^{-1}(y)}, thus F_{X,Y}(x, y) = F_X(min(x, g^{-1}(y))). Allowing x to go to infinity in both sides of this expression gives F_{X,Y}(∞, y), or F_Y(y) = F_X(g^{-1}(y)). Differentiating both sides of this expression with respect to y (using the chain rule on the right-hand side) yields

    f_Y(y) = f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).    (4)

Thus, the Bayesian network where X = g^{-1}(Y) and f_Y(y) meets the above condition will have the same CDF as the original Bayesian network, and the two Bayesian networks are equivalent, as stated in the following proposition.

Proposition 1. Suppose we have a Bayesian network with two variables X and Y with an arrow from X to Y, where Y is a conditionally deterministic, monotonically increasing function of X. Then the equivalent Bayesian network with an arrow from Y to X, where X is a conditionally deterministic function of Y, meets the conditions that f_Y(y) = f_X(g^{-1}(y)) · (d/dy) g^{-1}(y) and X = g^{-1}(Y).

Fig. 2. Graphical representation of the conditionally deterministic relationship of X on Y before and after performing an arc reversal on the Bayesian network of Figure 1.

When Y is a monotonically increasing (and therefore invertible) deterministic function of X, Proposition 1 gives a shortcut to finding the PDF of Y from the PDF of X that does not require the CDF of Y to be computed. We refer to using the operation in Proposition 1 as performing an arc reversal on the Bayesian network. After the operation is performed, the Bayesian network appears as in Figure 2.

Example 1. Suppose that a random variable X has PDF

    f_X(x) = 3x^2 if 0 < x < 1, and 0 elsewhere,

and we want to find f_Y(y) if Y = g(X) = 4X^2. Note that f_X(g^{-1}(y)) = 3y/4 and (d/dy) g^{-1}(y) = 1/(4√y). Using Proposition 1, we compute

    f_Y(y) = (3y/4) · 1/(4√y) = (3/16)√y if 0 < y < 4, and 0 elsewhere.
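As a quick numerical sanity check of Example 1 (an illustration added to this transcription, not part of the original paper), the sketch below compares the transformed density f_Y(y) = (3/16)√y with a Monte Carlo estimate obtained by sampling X and applying Y = 4X².

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample X with density f_X(x) = 3x^2 on (0,1) via the inverse CDF F_X(x) = x^3.
x = rng.uniform(size=200_000) ** (1.0 / 3.0)
y = 4.0 * x ** 2                      # the deterministic (monotone increasing) function

# Density of Y predicted by Proposition 1 / equation (4).
def f_Y(y):
    return np.where((y > 0) & (y < 4), 3.0 * np.sqrt(y) / 16.0, 0.0)

# Compare a histogram of the simulated Y values with the closed-form density.
hist, edges = np.histogram(y, bins=20, range=(0.0, 4.0), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - f_Y(mid))))   # should be small (sampling noise only)
```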

3.2 Monotonically Decreasing Functions

Consider a random variable Y which is a monotonically decreasing deterministic function of a random variable X. A Bayesian network representing this relationship is shown in Figure 1. The joint CDF for {X, Y} represents the following probability:

    F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y] = P[X ≤ x] − P[X ≤ g^{-1}(y)] = F_X(x) − F_X(g^{-1}(y))

for any x ∈ Ω_X such that F_X(x) > 0 and x ≥ g^{-1}(y). When Y is a monotonically decreasing function of X, X = g^{-1}(Y), thus F_{X,Y}(x, y) = F_X(x) − F_X(g^{-1}(y)). Allowing x to go to infinity in both sides of the last line of the expression above gives F_{X,Y}(∞, y), or F_Y(y) = 1 − F_X(g^{-1}(y)). Differentiating both sides of this expression with respect to y (using the chain rule on the right-hand side) yields

    f_Y(y) = −f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).    (5)

Thus, the Bayesian network where X = g^{-1}(Y) and f_Y(y) meets the above condition will have the same CDF as the original Bayesian network, and the two Bayesian networks are equivalent, as stated in the following proposition.

Proposition 2. Suppose we have a Bayesian network with two variables X and Y with an arrow from X to Y, where Y is a conditionally deterministic, monotonically decreasing function of X. Then the equivalent Bayesian network with an arrow from Y to X, where X is a conditionally deterministic function of Y, meets the conditions that f_Y(y) = −f_X(g^{-1}(y)) · (d/dy) g^{-1}(y) and X = g^{-1}(Y).

When Y is a monotonically decreasing (and therefore invertible) deterministic function of X, Proposition 2 gives a shortcut to finding the PDF of Y from the PDF of X that does not require the CDF of Y to be computed. As in the monotonically increasing case, we refer to use of the operation in Proposition 2 as an arc reversal.

Example 2. Let X have the uniform PDF over the unit interval, i.e., X ∼ U(0, 1). Find f_Y(y) if Y = g(X) = −(ln X)/λ.

Note that f_X(g^{-1}(y)) = 1 and (d/dy) g^{-1}(y) = −λe^{−λy}. Using Proposition 2, we compute

    f_Y(y) = −(1)(−λe^{−λy}) = λe^{−λy} if 0 < y < ∞, and 0 elsewhere.

3.3 Linear CDF Marginalization Operator

Suppose Y is a conditionally deterministic linear function of X, i.e., Y = g(X) = aX + b, a ≠ 0. The following operation will be used to determine the marginal PDF for Y:

    f_Y(y) = (f_X ⊗ p_{Y|x})^{−X}(y) = (1/|a|) · f_X((y − b)/a).    (6)

In this operation, X is marginalized from the combination of the PDF for X and the CMF for Y given X. The definition of the linear CDF marginalization operator follows directly from the expressions in Propositions 1 and 2.

Example 3. Suppose that a random variable X has PDF

    f_X(x) = 6x(1 − x) if 0 < x < 1, and 0 elsewhere.

Find f_Y(y) if the deterministic relationship Y = g(X) = 2X + 1 is represented by the CMF p_{Y|x} = 1{y = 2x + 1}. Note that

    f_X((y − b)/a) = f_X((y − 1)/2) = 6 · ((y − 1)/2) · (1 − (y − 1)/2) = −(3/2)y^2 + 6y − 9/2.

Using the operation in (6), we find the PDF for Y as

    f_Y(y) = (f_X ⊗ p_{Y|x})^{−X}(y) = −(3/4)y^2 + 3y − 9/4 if 1 < y < 3, and 0 elsewhere.

The following theorem is required for inference using MTE potentials in Bayesian networks with linear conditionally deterministic variables.

Theorem 3. If φ_1(x) is an MTE potential for X and Y is a conditionally deterministic linear function of X represented by the CMF p_{Y|x}, then φ_2(y) = (φ_1 ⊗ p_{Y|x})^{−X}(y) is an MTE potential.

Proof. Multiplication by 1/a or −1/a is multiplication by a constant, and MTE potentials are closed under multiplication by constants (the constants a_0 and a_i, i = 1, ..., m in (2) are revised). Exponential terms of the form exp{x} in φ_1(x) are revised to be of the form exp{(y − b)/a} in φ_2(y). Since exp{(y − b)/a} = exp{y/a} · exp{−b/a} and exp{−b/a} is a constant, the result is an MTE potential of the form in (2).
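The linear CDF marginalization operator in (6) is a one-line computation. The following sketch (added here for illustration, not from the original paper) applies it to Example 3 and checks the result against the closed-form answer and a Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

def f_X(x):
    # PDF of X from Example 3: 6x(1-x) on (0,1).
    return np.where((x > 0) & (x < 1), 6.0 * x * (1.0 - x), 0.0)

def linear_cdf_marginalization(f, a, b):
    """Return the PDF of Y = a*X + b given the PDF f of X (equation (6))."""
    return lambda y: f((y - b) / a) / abs(a)

f_Y = linear_cdf_marginalization(f_X, a=2.0, b=1.0)

# Closed-form answer from Example 3: f_Y(y) = -(3/4)y^2 + 3y - 9/4 on (1, 3).
y = np.linspace(1.0001, 2.9999, 5)
print(f_Y(y) - (-0.75 * y**2 + 3.0 * y - 2.25))   # ~0 everywhere

# Monte Carlo check: Beta(2, 2) has density 6x(1-x); transform samples by Y = 2X + 1.
x = rng.beta(2.0, 2.0, size=200_000)
print(np.mean(2.0 * x + 1.0), "vs exact E[Y] =", 2.0 * 0.5 + 1.0)
```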

Fig. 3. Graphical representation of a Bayesian network where Z is a deterministic function of X and Y.

3.4 Method of Convolutions

Let us consider a case where a conditionally deterministic variable has more than one parent. Let Z be a deterministic function of random variables X and Y, where X has PDF f_X(x) and {Y | x} has density f_{Y|x}(y). A Bayesian network representation of this case is shown in Figure 3.

Suppose Z = g(X, Y) is invertible in Y. Then, by arguments similar to those used in Propositions 1 and 2, we can show that the Bayesian network in Figure 3 is equivalent to the Bayesian network in Figure 4, where X has PDF f_X(x), {Z | x} has density

    f_{Z|x}(z) = |(∂/∂z) g^{-1}(x, z)| · f_{Y|x}(g^{-1}(x, z)),

and Y = g^{-1}(X, Z) is a conditionally deterministic function of X and Z. We can consider the Bayesian network in Figure 4 as being the network obtained from the Bayesian network in Figure 3 by reversing the arc (Y, Z).

Fig. 4. Graphical representation of the Bayesian network in Figure 3 after reversal of the arc (Y, Z).

We can now compute the marginal density of Z as follows:

    f_Z(z) = ∫ f_X(x) · f_{Z|x}(z) dx = ∫ f_X(x) · |(∂/∂z) g^{-1}(x, z)| · f_{Y|x}(g^{-1}(x, z)) dx.    (7)

The formula in (7) is called the method of convolutions in probability theory. The following result will be required for join tree operations when a variable is a linear conditionally deterministic function of its parents.

Proposition 4. Let X and Y be continuous, possibly dependent random variables with joint PDF f_{X,Y} and let Z = a_1 X + a_2 Y + b, a_2 ≠ 0. The un-normalized joint PDF for {X, Z} can be found as

    f_{X,Z}(x, z) ∝ f_{X,Y}(x, (z − a_1 x − b)/a_2).

Proof. Follows directly from the method of convolutions in probability theory.

We can replace X and a_1 in Proposition 4 with a vector of variables and a vector of non-zero constants, respectively, and the result holds. A transformation of the form in Proposition 4 is referred to as a convolution of the function f_{X,Y}(x, y) [6].

To use the convolution formula in Proposition 4 to find PDFs for linear conditionally deterministic variables in hybrid Bayesian networks with MTE potentials, the following theorem is required.

Theorem 5. If φ_1 is a joint MTE potential for {X, Y} and Z = a_1 X + a_2 Y + b, a_2 ≠ 0, the un-normalized joint PDF φ_2 for {X, Z} calculated from the convolution of φ_1 is an MTE potential.

Proof. Follows directly from the proof of Theorem 3.
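To illustrate Proposition 4 and equation (7) numerically, the sketch below (an illustration added to this transcription, using an arbitrary bivariate density rather than an MTE potential) computes the marginal density of Z = a_1 X + a_2 Y + b by substituting y = (z − a_1 x − b)/a_2 into the joint density, dividing by |a_2|, and integrating over x.

```python
import numpy as np

# Illustrative joint density for (X, Y): independent exponentials with rates 1 and 2.
def f_XY(x, y):
    return np.where((x > 0) & (y > 0), np.exp(-x) * 2.0 * np.exp(-2.0 * y), 0.0)

def marginal_of_Z(f_xy, a1, a2, b, z, x_grid):
    """Numerical version of equation (7) for Z = a1*X + a2*Y + b."""
    y = (z - a1 * x_grid - b) / a2          # convolution substitution (Proposition 4)
    integrand = f_xy(x_grid, y) / abs(a2)   # 1/|a2| normalizes the joint for {X, Z}
    return np.trapz(integrand, x_grid)

a1, a2, b = 1.0, 1.0, 0.0                   # Z = X + Y
x_grid = np.linspace(0.0, 30.0, 4001)
z_grid = np.linspace(0.0, 30.0, 601)
f_Z = np.array([marginal_of_Z(f_XY, a1, a2, b, z, x_grid) for z in z_grid])

# Checks: f_Z integrates to ~1 and E[Z] is ~ E[X] + E[Y] = 1 + 0.5 = 1.5.
print(np.trapz(f_Z, z_grid), np.trapz(z_grid * f_Z, z_grid))
```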

Fig. 5. A join tree for a Bayesian network with a deterministic variable.

4 Join Tree Operations with Linear Deterministic Variables

Suppose we have a node in a join tree for a continuous Bayesian network containing a set of variables X = (X_1, ..., X_N). Assume a variable X_i ∈ X is a linear deterministic function of the remaining variables X' = X \ {X_i}, i.e.,

    X_i = g(X_1, ..., X_{i−1}, X_{i+1}, ..., X_N) = W + b,

where W = a_1 X_1 + ... + a_{i−1} X_{i−1} + a_{i+1} X_{i+1} + ... + a_N X_N, with a_1, ..., a_{i−1}, a_{i+1}, ..., a_N and b defined as real numbers and at least one of the slope coefficients in the linear equation not equal to zero. The joint PDF of X' is denoted by φ. The joint PDF for X does not exist; however, we can find the marginal PDF for X_i by using the operations defined in Section 3.

Consider the join tree in Figure 5. The message passed from {X} to {X \ X_k} (where X_k ∈ X' and X_k ≠ X_i) is calculated using Proposition 4 as

    ψ(x'', x_i) = (φ ⊗ p_{X_i|x'})^{−X_k}(x'', x_i) = φ(x'', (x_i − b − \sum_{j=1, j∉{i,k}}^{N} a_j x_j)/a_k),

where x'' denotes a configuration of X \ {X_i, X_k}, so that x = (x'', x_i, x_k). The message from {X \ X_k} to {X_i} is calculated as

    ϕ(x_i) = ∫ ψ(x'', x_i) dx'',

where the integral is taken over the state space of X \ {X_i, X_k}. The potential ϕ is normalized to find the posterior marginal for X_i. If φ is initially an MTE potential, ϕ is an MTE potential, because the first message results in an MTE potential according to Theorem 5 and the second message results in an MTE potential because the class of MTE potentials is closed under marginalization.

Fig. 6. The Bayesian network for Example 5.

Fig. 7. The join tree for Example 5.

The next example utilizes the MTE approximation to the normal PDF for join tree operations with a deterministic variable, in order to compare answers with Hugin software.

Example 5. Consider the Bayesian network depicted in Figure 6. Suppose X ∼ N(0, 1), Y ∼ N(1, 1), and Z is a conditionally deterministic function of its parents, Z | x, y ∼ N(2 + x − y, 0). We can calculate the marginal distribution of Z by passing messages in the join tree shown in Figure 7.

The PDFs for X and Y, denoted by f_X and f_Y, respectively, are combined to form the joint PDF for {X, Y} and sent to {X, Y, Z} in the join tree. We next calculate the PDF for Z = X − Y + 2 using Proposition 4 as follows:

    f_Z(z) = ∫ f_{X,Y}(x, x − z + 2) dx.

Note that in this case, since X and Y are independent, f_{X,Y}(x, y) = f_X(x) f_Y(y). Thus, if f_X and f_Y are maintained as decomposed potentials in the message from {X, Y} to {X, Y, Z}, the calculation above can be simplified to

    f_Z(z) = ∫ f_X(x) f_Y(x − z + 2) dx.
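As an illustration (not from the paper), the following sketch evaluates this simplified convolution with the exact normal densities in place of their MTE approximations and confirms that the resulting marginal for Z has mean 1 and variance 2, the values implied by Z = 2 + X − Y.

```python
import numpy as np
from scipy.stats import norm

# Priors of Example 5: X ~ N(0,1), Y ~ N(1,1); Z = 2 + X - Y is deterministic.
x = np.linspace(-8.0, 8.0, 2001)

def f_Z(z):
    # f_Z(z) = integral of f_X(x) * f_Y(x - z + 2) dx (convolution from Proposition 4)
    return np.trapz(norm.pdf(x, 0.0, 1.0) * norm.pdf(x - z + 2.0, 1.0, 1.0), x)

z = np.linspace(-6.0, 8.0, 1401)
fz = np.array([f_Z(v) for v in z])

mean = np.trapz(z * fz, z)
var = np.trapz((z - mean) ** 2 * fz, z)
print(mean, var)   # approximately 1 and 2, matching the exact N(1, 2) marginal
```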

Fig. 8. The marginal PDF for Z in Example 5.

The marginal PDF for Z (shown in Figure 8) was created by approximating the normal PDFs in this example with the MTE approximation to the normal PDF presented in Cobb and Shenoy [1]. The expected value and variance of this marginal PDF are comparable with the exact results obtained using Hugin software, which gives an expected value of 1 and a variance of 2.

Suppose we obtain evidence that Z = 3 and pass this evidence as a message from {Z} to {X, Y, Z}. Since the existing potential for Z states that Z = 2 + X − Y, the evidence dictates the new deterministic relationship X = Y + 1, which is expressed as the CMF p_{X|y} and sent from {X, Y, Z} to {X, Y} in the join tree. The variables X and Y are no longer independent and now have a linear conditionally deterministic relationship. The revised Bayesian network is depicted in Figure 9.

To calculate the revised marginal distribution for X, we combine f_X(x) with the distribution created by applying the linear CDF marginalization operator in (6) to the prior distribution for Y (the latter is the message from {X, Y} to {X}) as follows:

    f_{X_ev}(x) = K · f_X(x) · (f_Y ⊗ p_{X|y})^{−Y}(x) = K · f_X(x) · f_Y(x − 1).

In this calculation, K is a normalization constant. The expected value and variance of the posterior marginal PDF for X are comparable with the exact results obtained using Hugin software, which gives an expected value of 1 and a variance of 0.5.
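A numerical sketch of this evidence step (added for illustration; it uses the exact normal densities rather than MTE approximations) combines the prior for X with the message f_Y(x − 1) and normalizes, recovering the posterior mean 1 and variance 0.5.

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-6.0, 8.0, 2801)

# Unnormalized posterior for X after observing Z = 3 (so X = Y + 1):
# prior f_X(x) times the message f_Y(x - 1) from the linear CDF marginalization operator.
post = norm.pdf(x, 0.0, 1.0) * norm.pdf(x - 1.0, 1.0, 1.0)

K = 1.0 / np.trapz(post, x)        # normalization constant
post *= K

mean = np.trapz(x * post, x)
var = np.trapz((x - mean) ** 2 * post, x)
print(mean, var)                   # approximately 1 and 0.5
```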

Fig. 9. The revised Bayesian network for Example 5 after observing evidence on Z.

To calculate the revised marginal distribution for Y, we combine the prior distribution for Y with the distribution created by applying the linear CDF marginalization operator in (6) to the prior distribution for X (the latter is the message from {X, Y} to {Y}) as follows:

    f_{Y_ev}(y) = K · f_Y(y) · (f_X ⊗ p_{Y|x})^{−X}(y) = K · f_Y(y) · f_X(y + 1).

In this calculation, K is a normalization constant. Propositions 1 and 2 allow us to use either p_{X|y} or p_{Y|x} as equivalent expressions of the deterministic relationship between Y and X. The expected value and variance of the posterior marginal PDF for Y are comparable with the exact results obtained using Hugin software, which gives an expected value of 0 and a variance of 0.5.

5 Example

The Bayesian network in this example (shown in Figure 10) contains one variable (A) which follows a beta distribution, one variable (C) with a Gaussian potential, and one variable (B) which is a linear conditionally deterministic function of its parent. All probability potentials are approximated in the calculations by MTE potentials.

5.1 Representation

The probability distribution for A is a beta distribution with parameters α = 2.7 and β = 1.3, i.e., A ∼ Beta(2.7, 1.3).

Fig. 10. The Bayesian network for the example problem.

The PDF for A is approximated (using the methods described in [4]) by a three-piece MTE potential α(a) = P(A), with each piece of the form in equation (2) defined on one of the intervals (0, d), [d, m), and [m, 1), and with α(a) = 0 elsewhere, where

    m = (1 − α)/(2 − α − β) = 0.85

and

    d = [(α − 1)(α + β − 3) − √((β − 1)(α − 1)(α + β − 3))] / [(α + β − 3)(α + β − 2)] ≈ 0.49.

The MTE potential for A is shown graphically in Figure 11, overlaid on the actual Beta(2.7, 1.3) distribution.

The probability distribution for B is defined as B | a ∼ N(2a + 1, 0). The conditional distribution for B is represented by a CMF as follows:

    β(a, b) = p_{B|a}(a, b) = 1{b = 2a + 1}(a, b).

The probability distribution for C is defined as C | b ∼ N(2b + 1, 1). This distribution is modeled with the MTE approximation to the normal PDF (denoted by δ) from [1].

Fig. 11. The MTE potential for A overlaid on the actual Beta(2.7, 1.3) distribution.

Fig. 12. The join tree for the example problem.

5.2 Computing Messages

The join tree for the example problem is shown in Figure 12. The messages required to calculate prior marginals for each variable in the network without evidence are as follows:

1) α from {A} to {A, B};
2) (α ⊗ β)^{−A} from {A, B} to {B} and from {B} to {B, C};
3) ((α ⊗ β)^{−A} ⊗ δ)^{−B} from {B, C} to {C}.

5.3 Prior Marginals

The prior marginal distribution for B is the message (α ⊗ β)^{−A} sent from {A, B} to {B}, and the prior marginal distribution for C is the message sent from {B, C} to {C}. The expected value and variance of each distribution are calculated from the corresponding marginal MTE potential. The prior marginal distributions for B and C are shown graphically in Figure 13.
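For illustration (not part of the paper), the sketch below computes these two prior marginals with the exact Beta(2.7, 1.3) density in place of its MTE approximation; the moments it reports are the exact values that the MTE-based messages approximate.

```python
import numpy as np
from scipy.stats import beta, norm

# Prior marginal of B = 2A + 1 via the linear CDF marginalization operator (6):
# f_B(b) = (1/2) f_A((b - 1)/2) on (1, 3).
b = np.linspace(1.0 + 1e-6, 3.0 - 1e-6, 4001)
f_B = 0.5 * beta.pdf((b - 1.0) / 2.0, 2.7, 1.3)

EB = np.trapz(b * f_B, b)
VB = np.trapz((b - EB) ** 2 * f_B, b)

# Prior marginal of C, where C | b ~ N(2b + 1, 1): f_C(c) = integral f_B(b) N(c; 2b+1, 1) db.
c = np.linspace(-2.0, 14.0, 801)
f_C = np.array([np.trapz(f_B * norm.pdf(v, 2.0 * b + 1.0, 1.0), b) for v in c])

EC = np.trapz(c * f_C, c)
VC = np.trapz((c - EC) ** 2 * f_C, c)
print(EB, VB)   # about 2.35 and 0.18
print(EC, VC)   # about 5.70 and 1.70
```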

Fig. 13. The prior marginal distributions for B (left) and C (right).

Fig. 14. The posterior marginal distribution for B considering the evidence C = 6.

5.4 Entering Evidence

Assume evidence exists that C = 6 and define e_C = 6. Define η = (α ⊗ β)^{−A} and ϑ(a, b) = p_{A|b}(a, b) = 1{a = 0.5b − 0.5}(a, b) as the potentials resulting from the reversal of the arc between A and B. The evidence e_C = 6 is passed from {C} to {B, C} in the join tree, where the existing potential is restricted to δ(b, 6). This likelihood potential is passed from {B, C} to {B} in the join tree. Denote the unnormalized posterior marginal distribution for B as

    ξ'(b) = η(b) · δ(b, 6).

The normalization constant is calculated as K = ∫ η(b) · δ(b, 6) db. Thus, the normalized marginal distribution for B is found as ξ(b) = K^{−1} ξ'(b). The expected value and variance of this distribution (which is displayed in Figure 14) are calculated from the normalized MTE potential.
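A numerical sketch of this step (illustrative, again using the exact Beta density and normal likelihood instead of their MTE approximations): the prior marginal η for B is multiplied by the likelihood δ(b, 6) = N(6; 2b + 1, 1), normalized, and then mapped back to A through the arc-reversal relationship a = 0.5b − 0.5.

```python
import numpy as np
from scipy.stats import beta, norm

b = np.linspace(1.0 + 1e-6, 3.0 - 1e-6, 4001)

eta = 0.5 * beta.pdf((b - 1.0) / 2.0, 2.7, 1.3)   # prior marginal of B = 2A + 1
lik = norm.pdf(6.0, 2.0 * b + 1.0, 1.0)           # delta(b, 6): likelihood of C = 6 given b

xi_un = eta * lik                                  # unnormalized posterior for B
K = np.trapz(xi_un, b)
xi = xi_un / K                                     # normalized posterior for B

EB = np.trapz(b * xi, b)
VB = np.trapz((b - EB) ** 2 * xi, b)
print("posterior B:", EB, VB)

# Posterior for A via the arc reversal A = 0.5*B - 0.5: theta(a) = 2 * xi(2a + 1).
a = 0.5 * b - 0.5
theta = 2.0 * xi
EA = np.trapz(a * theta, a)
VA = np.trapz((a - EA) ** 2 * theta, a)
print("posterior A:", EA, VA)
```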

Fig. 15. The posterior marginal distribution for A considering the evidence c = 6.

Using the results of Proposition 1, we determine the posterior marginal distribution for A. Define θ = (ξ ⊗ ν)^{−B} as

    θ(a) = (1/0.5) · ξ(2a + 1).

The CMF ν(a, b) = p_{B|a}(a, b) = 1{b = 2a + 1}(a, b) is obtained by reversing the arc between A and B in Figure 10. The expected value and variance of this distribution are calculated from the resulting MTE potential. The posterior marginal distribution for A considering the evidence is shown graphically in Figure 15.

6 Summary and Conclusions

This paper has described operations required for inference in continuous Bayesian networks containing variables that are linear conditionally deterministic functions of their parents. Since the joint PDF for a network with deterministic variables does not exist, the operations presented are derived from the method of convolutions in probability theory. Similar operations to those presented in this paper are incorporated in an inference algorithm for hybrid Bayesian networks (containing discrete and continuous variables) in [3]. This algorithm requires a mixed potential representation to accommodate mixed distributions. Nonlinear deterministic relationships can be accommodated in continuous Bayesian networks by extending the operations in this paper to piecewise linear functions which approximate nonlinear functions, as shown in [2].

Acknowledgments

This research was completed while the first author was a doctoral student at the University of Kansas and was supported by a research assistantship from the School of Business. We thank Thomas D. Nielsen, Antonio Salmerón, participants of the PGM 2004 conference, and two anonymous referees for valuable comments, suggestions, and discussions.

References

[1] B.R. Cobb, P.P. Shenoy, Inference in hybrid Bayesian networks with mixtures of truncated exponentials, Int. J. Approx. Reason. (2005), in press.

[2] B.R. Cobb, P.P. Shenoy, Nonlinear deterministic relationships in Bayesian networks, in: L. Godo (Ed.), Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 3571, Springer-Verlag, Heidelberg, 2005.

[3] B.R. Cobb, P.P. Shenoy, Hybrid Bayesian networks with linear deterministic variables, in: F. Bacchus, T. Jaakkola (Eds.), Uncertainty in Artificial Intelligence, Vol. 21, AUAI Press, Corvallis, OR, 2005.

[4] B.R. Cobb, R. Rumí, P.P. Shenoy, Approximating probability density functions with mixtures of truncated exponentials, in: Proceedings of the Tenth Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU-04), 2004. Available for download at: pshenoy/papers/ipmu04.pdf

[5] R.G. Cowell, A.P. Dawid, S.L. Lauritzen, D.J. Spiegelhalter, Probabilistic Networks and Expert Systems, Springer, New York.

[6] R.J. Larsen, M.L. Marx, An Introduction to Mathematical Statistics and its Applications, Prentice Hall, Upper Saddle River, N.J.

[7] S.L. Lauritzen, Propagation of probabilities, means and variances in mixed graphical association models, J. Amer. Stat. Assoc. 87 (1992).

[8] S.L. Lauritzen, F. Jensen, Stable local computation with conditional Gaussian distributions, Stat. Comput. 11 (2001).

[9] S. Moral, R. Rumí, A. Salmerón, Mixtures of truncated exponentials in hybrid Bayesian networks, in: P. Besnard, S. Benferhat (Eds.), Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 2143, Springer-Verlag, Heidelberg, 2001.

[10] S. Moral, R. Rumí, A. Salmerón, Estimating mixtures of truncated exponentials from data, in: J.A. Gámez, A. Salmerón (Eds.), Proceedings of the First European Workshop on Probabilistic Graphical Models (PGM 02), Cuenca, Spain, 2002.

[11] S. Moral, R. Rumí, A. Salmerón, Approximating conditional MTE distributions by means of mixed trees, in: T.D. Nielsen, N.L. Zhang (Eds.), Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 2711, Springer-Verlag, Heidelberg, 2003.

[12] R. Rumí, Modelos de Redes Bayesianas con Variables Discretas y Continuas, Doctoral Thesis, Universidad de Almería, Departamento de Estadística y Matemática Aplicada, Almería, Spain.


More information

Discrete Bayesian Networks: The Exact Posterior Marginal Distributions

Discrete Bayesian Networks: The Exact Posterior Marginal Distributions arxiv:1411.6300v1 [cs.ai] 23 Nov 2014 Discrete Bayesian Networks: The Exact Posterior Marginal Distributions Do Le (Paul) Minh Department of ISDS, California State University, Fullerton CA 92831, USA dminh@fullerton.edu

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College

More information

Exact distribution theory for belief net responses

Exact distribution theory for belief net responses Exact distribution theory for belief net responses Peter M. Hooper Department of Mathematical and Statistical Sciences University of Alberta Edmonton, Canada, T6G 2G1 hooper@stat.ualberta.ca May 2, 2008

More information

Learning Mixtures of Truncated Basis Functions from Data

Learning Mixtures of Truncated Basis Functions from Data Learning Mixtures of Truncated Basis Functions from Data Helge Langseth Department of Computer and Information Science The Norwegian University of Science and Technology Trondheim (Norway) helgel@idi.ntnu.no

More information

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS Lecture 16, 6/1/2005 University of Washington, Department of Electrical Engineering Spring 2005 Instructor: Professor Jeff A. Bilmes Uncertainty & Bayesian Networks

More information

Naïve Bayes classification

Naïve Bayes classification Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss

More information

Probabilistic Graphical Models. Guest Lecture by Narges Razavian Machine Learning Class April

Probabilistic Graphical Models. Guest Lecture by Narges Razavian Machine Learning Class April Probabilistic Graphical Models Guest Lecture by Narges Razavian Machine Learning Class April 14 2017 Today What is probabilistic graphical model and why it is useful? Bayesian Networks Basic Inference

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Franz Pernkopf, Robert Peharz, Sebastian Tschiatschek Graz University of Technology, Laboratory of Signal Processing and Speech Communication Inffeldgasse

More information

CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA

CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA Inaccurate parameters in Gaussian Bayesian networks Miguel A. Gómez-Villegas Paloma Main Rosario Susi Cuaderno de Trabajo número 02/2008 Los Cuadernos

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

Continuous Random Variables and Continuous Distributions

Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Expectation & Variance of Continuous Random Variables ( 5.2) The Uniform Random Variable

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

Uniform Correlation Mixture of Bivariate Normal Distributions and. Hypercubically-contoured Densities That Are Marginally Normal

Uniform Correlation Mixture of Bivariate Normal Distributions and. Hypercubically-contoured Densities That Are Marginally Normal Uniform Correlation Mixture of Bivariate Normal Distributions and Hypercubically-contoured Densities That Are Marginally Normal Kai Zhang Department of Statistics and Operations Research University of

More information

Bayes spaces: use of improper priors and distances between densities

Bayes spaces: use of improper priors and distances between densities Bayes spaces: use of improper priors and distances between densities J. J. Egozcue 1, V. Pawlowsky-Glahn 2, R. Tolosana-Delgado 1, M. I. Ortego 1 and G. van den Boogaart 3 1 Universidad Politécnica de

More information