Operations for Inference in Continuous Bayesian Networks with Linear Deterministic Variables


Appeared in: International Journal of Approximate Reasoning, Vol. 42, Nos. 1--2, 2006.

Operations for Inference in Continuous Bayesian Networks with Linear Deterministic Variables

Barry R. Cobb (a), Prakash P. Shenoy (b)

(a) Department of Economics and Business, Virginia Military Institute, Lexington, Virginia 24450, U.S.A.
(b) School of Business, University of Kansas, 1300 Sunnyside Ave., Summerfield Hall, Lawrence, Kansas 66045-7585, U.S.A.

Abstract

An important class of continuous Bayesian networks are those that have linear conditionally deterministic variables (a variable that is a linear deterministic function of its parents). In this case, the joint density function for the variables in the network does not exist. Conditional linear Gaussian (CLG) distributions can handle such cases when all variables are normally distributed. In this paper, we develop the operations required for performing inference with linear conditionally deterministic variables in continuous Bayesian networks using relationships derived from joint cumulative distribution functions (CDFs). These methods allow inference in networks with linear deterministic variables and non-Gaussian distributions.

Key words: Bayesian networks, conditional linear Gaussian models, deterministic variables

1 Introduction

Bayesian networks model knowledge about propositions in uncertain domains using graphical and numerical representations. At the qualitative level, a Bayesian network is a directed acyclic graph where nodes represent variables and the (missing) edges represent conditional independence relations among the variables.

Corresponding author. E-mail addresses: cobbbr@vmi.edu (Barry R. Cobb), pshenoy@ku.edu (Prakash P. Shenoy).

Preprint submitted to Elsevier Science, 3 November 2005.

At the numerical level, a Bayesian network consists of a factorization of a joint probability distribution into a set of conditional distributions, one for each variable in the network.

Continuous Bayesian networks contain variables whose state spaces are uncountable. A commonly used type of Bayesian network which accommodates continuous variables is the conditional linear Gaussian (CLG) model [5,7]. In CLG models, the distribution of a continuous variable is a linear Gaussian function of its continuous parents. The scheme originally developed by Lauritzen [7] allowed exact computation of means and variances in CLG networks when the conditional distribution of a variable given its continuous parents has a positive variance; however, this algorithm did not always compute the exact marginal densities of continuous variables. A new computational scheme for CLG models was developed by Lauritzen and Jensen [8]. To find full local marginals, this scheme places some restrictions on the construction and initialization of junction trees.

An important class of continuous Bayesian networks are those that have linear conditionally deterministic variables (a variable that is a deterministic function of its parents). In this case, the joint density function for the variables in the network does not exist. CLG models can handle such cases when all variables are normally distributed. However, for models where continuous variables are not normally distributed, methods for carrying out exact inference in networks with linear deterministic relationships have not been developed.

Exact inference in hybrid Bayesian networks can be performed using mixtures of truncated exponentials (MTE) potentials [9,12]. General formulations of MTE potentials which approximate the normal probability density function (PDF) exist [1]; however, these formulations cannot be used to model a conditional distribution where the variance of a variable given values of its continuous parents is zero. In this paper, we develop inference operations for linear conditionally deterministic variables using relationships derived from joint cumulative distribution functions (CDFs). These operations allow MTE potentials to be used for inference in any continuous CLG model, as well as other models that have linear conditionally deterministic variables which are non-Gaussian.

The remainder of this paper is organized as follows. Section 2 introduces notation and definitions used throughout the paper. Section 3 introduces techniques for using CDFs to construct PDFs for deterministic variables. Section 4 introduces join tree operations for linear deterministic variables. Section 5 contains an example of inference in a continuous Bayesian network containing linear deterministic variables. Section 6 summarizes and states directions for future research.

Fig. 1. Graphical representation of the conditionally deterministic relationship of Y given X determined by the CMF p_{Y|x}.

2 Notation and Definitions

This section contains notation and definitions that will be used throughout the remainder of the paper.

2.1 Notation

Random variables in a Bayesian network will be denoted by capital letters, e.g., A, B, C. Sets of variables will be denoted by boldface capital letters, e.g., X. All variables in this paper are assumed to take values in uncountable (continuous) state spaces. If X is a set of variables, x is a configuration of specific states of those variables. The continuous state space of X is denoted by Ω_X. MTE probability potentials are denoted by lower-case Greek letters, e.g., α, β, γ.

In graphical representations, continuous nodes in Bayesian networks are represented by double-border ovals. Variables that are deterministic functions of their parents are represented by triple-border ovals. Shaded nodes are degenerate, indicating that evidence has restricted the variable to one value.

2.2 Conditional Mass Function (CMF)

When relationships between continuous variables are deterministic, the joint PDF does not exist. If Y is a deterministic function of the variables in X, i.e., Y = g(X), the conditional mass function (CMF) for {Y | x} is defined as

    p_{Y|x} = 1{y = g(x)},    (1)

where 1{A} is the indicator function of the event A, i.e., 1{A}(B) = 1 if B = A and 0 otherwise. Graphically, the conditionally deterministic relationship of Y given X is represented in a Bayesian network model as shown in Figure 1, where X consists of a single continuous variable X.

2.3 Mixtures of Truncated Exponentials

A mixture of truncated exponentials (MTE) potential [9,12] has the following definition.

MTE potential. Let X = (X_1, ..., X_n) be an n-dimensional random variable. A function φ: Ω_X → R^+ is an MTE potential if one of the next two conditions holds:

(1) The potential φ can be written as

    φ(x) = a_0 + \sum_{i=1}^{m} a_i \exp\{ \sum_{j=1}^{n} b_i^{(j)} x_j \}    (2)

for all x ∈ Ω_X, where a_i, i = 0, ..., m, and b_i^{(j)}, i = 1, ..., m, j = 1, ..., n, are real numbers.

(2) The domain of the variables, Ω_X, is partitioned into hypercubes {Ω_X^1, ..., Ω_X^k} such that φ is defined as

    φ(x) = φ_i(x)  if x ∈ Ω_X^i, i = 1, ..., k,    (3)

where each φ_i, i = 1, ..., k, can be written in the form of equation (2).

In the definition above, k is the number of pieces and m is the number of exponential terms in each piece of the MTE potential. In this paper, all MTE potentials are equal to zero in unspecified regions.

Moral et al. [10] propose an iterative algorithm based on least squares approximation to estimate MTE potentials from data. Moral et al. [11] describe a method to approximate conditional MTE potentials using a mixed tree structure. Cobb et al. [4] describe a nonlinear optimization procedure used to fit MTE parameters for approximations to standard PDFs, including the uniform, exponential, gamma, beta, and lognormal distributions.

Inference in continuous Bayesian networks where all conditional probability distributions are approximated by MTE potentials is performed using the operations of restriction, combination, and marginalization, as defined by Moral et al. [9] and further described by Cobb and Shenoy [1]. Restriction involves substituting real numbers representing evidence into a potential. Combination of two MTE potentials is pointwise multiplication; if two MTE potentials for X are denoted by φ and ψ, the combination of these two potentials is denoted by φ ⊗ ψ. Marginalization of a variable from an MTE potential is closed-form integration; if an MTE potential for X is denoted by φ, the marginalization of a variable X ∈ X from φ is denoted by φ^{−X}. Operations used to marginalize a variable from the combination of an MTE potential and a CMF are defined later in the paper.
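The piecewise-exponential form in equations (2) and (3) is straightforward to evaluate numerically. The following sketch (not from the paper; the pieces, split points, and coefficients are illustrative assumptions only) shows one way to represent and evaluate a one-dimensional, two-piece MTE potential.

```python
import numpy as np

# A one-dimensional MTE potential: a list of (lower, upper, a0, [(a_i, b_i), ...]) pieces.
# Each piece evaluates to a0 + sum_i a_i * exp(b_i * x) on [lower, upper); zero elsewhere.
# The coefficients below are illustrative, not those of any potential in the paper.
PIECES = [
    (0.0, 0.5, 1.2, [(-0.3, 1.0), (0.1, -2.0)]),
    (0.5, 1.0, 0.7, [(0.4, 0.5)]),
]

def mte_eval(x, pieces=PIECES):
    """Evaluate a piecewise MTE potential at scalar x (zero outside all pieces)."""
    for lo, hi, a0, terms in pieces:
        if lo <= x < hi:
            return a0 + sum(a * np.exp(b * x) for a, b in terms)
    return 0.0

# Marginalization of an MTE potential is closed-form integration; here we simply
# check numerically that the potential integrates to a finite constant on [0, 1].
xs = np.linspace(0.0, 1.0, 2001)
print("integral over [0,1]:", np.trapz([mte_eval(x) for x in xs], xs))
```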

3 Using CDFs to Construct PDFs for Deterministic Variables

This section contains standard results from probability theory which describe methods of constructing CDFs and their corresponding PDFs for variables that are deterministic functions of their parents. These results are re-stated here for completeness.

3.1 Monotonically Increasing Functions

Consider a random variable Y which is a monotonically increasing deterministic function of a random variable X. A Bayesian network representing this relationship is shown in Figure 1. The joint CDF for {X, Y} represents the following probability:

    F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y] = P[X ≤ x] · P[Y ≤ y | X ≤ x] = F_X(x) · P[Y ≤ y | X ≤ x]

for any x ∈ Ω_X such that F_X(x) > 0. When Y is a monotonically increasing function of X, X = g^{-1}(Y) and the event {Y ≤ y} coincides with the event {X ≤ g^{-1}(y)}, thus F_{X,Y}(x, y) = F_X(min(x, g^{-1}(y))). Allowing x to go to infinity in both sides of this expression gives F_{X,Y}(∞, y), or F_Y(y) = F_X(g^{-1}(y)). Differentiating both sides of this expression with respect to y (using the chain rule on the right-hand side) yields

    f_Y(y) = f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).    (4)

Thus, the Bayesian network where X = g^{-1}(Y) and f_Y(y) meets the above condition will have the same CDF as the original Bayesian network, and the two Bayesian networks are equivalent, as stated in the following proposition.

Proposition 1. Suppose we have a Bayesian network with two variables X and Y with an arrow from X to Y, where Y is a conditionally deterministic, monotonically increasing function of X. Then the equivalent Bayesian network with an arrow from Y to X, where X is a conditionally deterministic function of Y, meets the conditions that f_Y(y) = f_X(g^{-1}(y)) · (d/dy) g^{-1}(y) and X = g^{-1}(Y).

Fig. 2. Graphical representation of the conditionally deterministic relationship of X on Y before and after performing an arc reversal on the Bayesian network of Figure 1.

When Y is a monotonically increasing (and therefore invertible) deterministic function of X, Proposition 1 gives a shortcut to finding the PDF of Y from the PDF of X that does not require the CDF of Y to be computed. We refer to using the operation in Proposition 1 as performing an arc reversal on the Bayesian network. After the operation is performed, the Bayesian network appears as in Figure 2.

Example 1. Suppose that a random variable X has PDF

    f_X(x) = 3x^2 if 0 < x < 1, and 0 elsewhere,

and we want to find f_Y(y) if Y = g(X) = 4X^2. Note that f_X(g^{-1}(y)) = 3y/4 and (d/dy) g^{-1}(y) = 1/(4√y). Using Proposition 1, we compute

    f_Y(y) = (3y/4) · 1/(4√y) = (3/16)√y if 0 < y < 4, and 0 elsewhere.
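As a quick numerical sanity check of Example 1 (an illustration added to this transcription, not part of the original paper), the sketch below compares the transformed density f_Y(y) = (3/16)√y with a Monte Carlo estimate obtained by sampling X and applying Y = 4X².

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample X with density f_X(x) = 3x^2 on (0,1) via the inverse CDF F_X(x) = x^3.
x = rng.uniform(size=200_000) ** (1.0 / 3.0)
y = 4.0 * x ** 2                      # the deterministic (monotone increasing) function

# Density of Y predicted by Proposition 1 / equation (4).
def f_Y(y):
    return np.where((y > 0) & (y < 4), 3.0 * np.sqrt(y) / 16.0, 0.0)

# Compare a histogram of the simulated Y values with the closed-form density.
hist, edges = np.histogram(y, bins=20, range=(0.0, 4.0), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - f_Y(mid))))   # should be small (sampling noise only)
```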

3.2 Monotonically Decreasing Functions

Consider a random variable Y which is a monotonically decreasing deterministic function of a random variable X. A Bayesian network representing this relationship is shown in Figure 1. The joint CDF for {X, Y} represents the following probability:

    F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y] = P[X ≤ x] − P[X ≤ g^{-1}(y)] = F_X(x) − F_X(g^{-1}(y))

for any x ∈ Ω_X such that F_X(x) > 0 and x ≥ g^{-1}(y). When Y is a monotonically decreasing function of X, X = g^{-1}(Y), thus F_{X,Y}(x, y) = F_X(x) − F_X(g^{-1}(y)). Allowing x to go to infinity in both sides of the last line of the expression above gives F_{X,Y}(∞, y), or F_Y(y) = 1 − F_X(g^{-1}(y)). Differentiating both sides of this expression with respect to y (using the chain rule on the right-hand side) yields

    f_Y(y) = −f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).    (5)

Thus, the Bayesian network where X = g^{-1}(Y) and f_Y(y) meets the above condition will have the same CDF as the original Bayesian network, and the two Bayesian networks are equivalent, as stated in the following proposition.

Proposition 2. Suppose we have a Bayesian network with two variables X and Y with an arrow from X to Y, where Y is a conditionally deterministic, monotonically decreasing function of X. Then the equivalent Bayesian network with an arrow from Y to X, where X is a conditionally deterministic function of Y, meets the conditions that f_Y(y) = −f_X(g^{-1}(y)) · (d/dy) g^{-1}(y) and X = g^{-1}(Y).

When Y is a monotonically decreasing (and therefore invertible) deterministic function of X, Proposition 2 gives a shortcut to finding the PDF of Y from the PDF of X that does not require the CDF of Y to be computed. As in the monotonically increasing case, we refer to use of the operation in Proposition 2 as an arc reversal.

Example 2. Let X have the uniform PDF over the unit interval, i.e., X ∼ U(0, 1). Find f_Y(y) if Y = g(X) = −(ln X)/λ.

Note that f_X(g^{-1}(y)) = 1 and (d/dy) g^{-1}(y) = −λe^{−λy}. Using Proposition 2, we compute

    f_Y(y) = −(1)(−λe^{−λy}) = λe^{−λy} if 0 < y < ∞, and 0 elsewhere.

3.3 Linear CDF Marginalization Operator

Suppose Y is a conditionally deterministic linear function of X, i.e., Y = g(X) = aX + b, a ≠ 0. The following operation will be used to determine the marginal PDF for Y:

    f_Y(y) = (f_X ⊗ p_{Y|x})^{−X}(y) = (1/|a|) · f_X((y − b)/a).    (6)

In this operation, X is marginalized from the combination of the PDF for X and the CMF for Y given X. The definition of the linear CDF marginalization operator follows directly from the expressions in Propositions 1 and 2.

Example 3. Suppose that a random variable X has PDF

    f_X(x) = 6x(1 − x) if 0 < x < 1, and 0 elsewhere.

Find f_Y(y) if the deterministic relationship Y = g(X) = 2X + 1 is represented by the CMF p_{Y|x} = 1{y = 2x + 1}. Note that

    f_X((y − b)/a) = f_X((y − 1)/2) = 6 · ((y − 1)/2) · (1 − (y − 1)/2) = −(3/2)y^2 + 6y − 9/2.

Using the operation in (6), we find the PDF for Y as

    f_Y(y) = (f_X ⊗ p_{Y|x})^{−X}(y) = −(3/4)y^2 + 3y − 9/4 if 1 < y < 3, and 0 elsewhere.

The following theorem is required for inference using MTE potentials in Bayesian networks with linear conditionally deterministic variables.

Theorem 3. If φ_1(x) is an MTE potential for X and Y is a conditionally deterministic linear function of X represented by the CMF p_{Y|x}, then φ_2(y) = (φ_1 ⊗ p_{Y|x})^{−X}(y) is an MTE potential.

Proof. Multiplication by 1/a or −1/a is multiplication by a constant, and MTE potentials are closed under multiplication by constants (the constants a_0 and a_i, i = 1, ..., m in (2) are revised). Exponential terms of the form exp{x} in φ_1(x) are revised to be of the form exp{(y − b)/a} in φ_2(y). Since exp{(y − b)/a} = exp{y/a} · exp{−b/a} and exp{−b/a} is a constant, the result is an MTE potential of the form in (2).
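The linear CDF marginalization operator in (6) is a one-line computation. The following sketch (added here for illustration, not from the original paper) applies it to Example 3 and checks the result against the closed-form answer and a Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

def f_X(x):
    # PDF of X from Example 3: 6x(1-x) on (0,1).
    return np.where((x > 0) & (x < 1), 6.0 * x * (1.0 - x), 0.0)

def linear_cdf_marginalization(f, a, b):
    """Return the PDF of Y = a*X + b given the PDF f of X (equation (6))."""
    return lambda y: f((y - b) / a) / abs(a)

f_Y = linear_cdf_marginalization(f_X, a=2.0, b=1.0)

# Closed-form answer from Example 3: f_Y(y) = -(3/4)y^2 + 3y - 9/4 on (1, 3).
y = np.linspace(1.0001, 2.9999, 5)
print(f_Y(y) - (-0.75 * y**2 + 3.0 * y - 2.25))   # ~0 everywhere

# Monte Carlo check: Beta(2, 2) has density 6x(1-x); transform samples by Y = 2X + 1.
x = rng.beta(2.0, 2.0, size=200_000)
print(np.mean(2.0 * x + 1.0), "vs exact E[Y] =", 2.0 * 0.5 + 1.0)
```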

Fig. 3. Graphical representation of a Bayesian network where Z is a deterministic function of X and Y.

3.4 Method of Convolutions

Let us consider a case where a conditionally deterministic variable has more than one parent. Let Z be a deterministic function of random variables X and Y, where X has PDF f_X(x) and {Y | x} has density f_{Y|x}(y). A Bayesian network representation of this case is shown in Figure 3.

Suppose Z = g(X, Y) is invertible in Y. Then, by arguments similar to those used in Propositions 1 and 2, we can show that the Bayesian network in Figure 3 is equivalent to the Bayesian network in Figure 4, where X has PDF f_X(x), {Z | x} has density

    f_{Z|x}(z) = |(∂/∂z) g^{-1}(x, z)| · f_{Y|x}(g^{-1}(x, z)),

and Y = g^{-1}(X, Z) is a conditionally deterministic function of X and Z. We can consider the Bayesian network in Figure 4 as being the network obtained from the Bayesian network in Figure 3 by reversing the arc (Y, Z).

Fig. 4. Graphical representation of the Bayesian network in Figure 3 after reversal of the arc (Y, Z).

We can now compute the marginal density of Z as follows:

    f_Z(z) = ∫ f_X(x) · f_{Z|x}(z) dx = ∫ f_X(x) · |(∂/∂z) g^{-1}(x, z)| · f_{Y|x}(g^{-1}(x, z)) dx.    (7)

The formula in (7) is called the method of convolutions in probability theory. The following result will be required for join tree operations when a variable is a linear conditionally deterministic function of its parents.

Proposition 4. Let X and Y be continuous, possibly dependent random variables with joint PDF f_{X,Y} and let Z = a_1 X + a_2 Y + b, a_2 ≠ 0. The un-normalized joint PDF for {X, Z} can be found as

    f_{X,Z}(x, z) ∝ f_{X,Y}(x, (z − a_1 x − b)/a_2).

Proof. Follows directly from the method of convolutions in probability theory.

We can replace X and a_1 in Proposition 4 with a vector of variables and a vector of non-zero constants, respectively, and the result holds. A transformation of the form in Proposition 4 is referred to as a convolution of the function f_{X,Y}(x, y) [6].

To use the convolution formula in Proposition 4 to find PDFs for linear conditionally deterministic variables in hybrid Bayesian networks with MTE potentials, the following theorem is required.

Theorem 5. If φ_1 is a joint MTE potential for {X, Y} and Z = a_1 X + a_2 Y + b, a_2 ≠ 0, the un-normalized joint PDF φ_2 for {X, Z} calculated from the convolution of φ_1 is an MTE potential.

Proof. Follows directly from the proof of Theorem 3.
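To illustrate Proposition 4 and equation (7) numerically, the sketch below (an illustration added to this transcription, using an arbitrary bivariate density rather than an MTE potential) computes the marginal density of Z = a_1 X + a_2 Y + b by substituting y = (z − a_1 x − b)/a_2 into the joint density, dividing by |a_2|, and integrating over x.

```python
import numpy as np

# Illustrative joint density for (X, Y): independent exponentials with rates 1 and 2.
def f_XY(x, y):
    return np.where((x > 0) & (y > 0), np.exp(-x) * 2.0 * np.exp(-2.0 * y), 0.0)

def marginal_of_Z(f_xy, a1, a2, b, z, x_grid):
    """Numerical version of equation (7) for Z = a1*X + a2*Y + b."""
    y = (z - a1 * x_grid - b) / a2          # convolution substitution (Proposition 4)
    integrand = f_xy(x_grid, y) / abs(a2)   # 1/|a2| normalizes the joint for {X, Z}
    return np.trapz(integrand, x_grid)

a1, a2, b = 1.0, 1.0, 0.0                   # Z = X + Y
x_grid = np.linspace(0.0, 30.0, 4001)
z_grid = np.linspace(0.0, 30.0, 601)
f_Z = np.array([marginal_of_Z(f_XY, a1, a2, b, z, x_grid) for z in z_grid])

# Checks: f_Z integrates to ~1 and E[Z] is ~ E[X] + E[Y] = 1 + 0.5 = 1.5.
print(np.trapz(f_Z, z_grid), np.trapz(z_grid * f_Z, z_grid))
```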

Fig. 5. A join tree for a Bayesian network with a deterministic variable.

4 Join Tree Operations with Linear Deterministic Variables

Suppose we have a node in a join tree for a continuous Bayesian network containing a set of variables X = (X_1, ..., X_N). Assume a variable X_i ∈ X is a linear deterministic function of the remaining variables X' = X \ {X_i}, i.e.,

    X_i = g(X_1, ..., X_{i−1}, X_{i+1}, ..., X_N) = W + b,

where W = a_1 X_1 + ... + a_{i−1} X_{i−1} + a_{i+1} X_{i+1} + ... + a_N X_N, with a_1, ..., a_{i−1}, a_{i+1}, ..., a_N and b defined as real numbers and at least one of the slope coefficients in the linear equation not equal to zero. The joint PDF of X' is denoted by φ. The joint PDF for X does not exist; however, we can find the marginal PDF for X_i by using the operations defined in Section 3.

Consider the join tree in Figure 5. The message passed from {X} to {X \ X_k} (where X_k ∈ X' and X_k ≠ X_i) is calculated using Proposition 4 as

    ψ(x'', x_i) = (φ ⊗ p_{X_i|x'})^{−X_k}(x'', x_i) = φ(x'', (x_i − b − \sum_{j=1, j∉{i,k}}^{N} a_j x_j)/a_k),

where x'' denotes a configuration of X \ {X_i, X_k}, so that x = (x'', x_i, x_k). The message from {X \ X_k} to {X_i} is calculated as

    ϕ(x_i) = ∫ ψ(x'', x_i) dx'',

where the integral is taken over the state space of X \ {X_i, X_k}. The potential ϕ is normalized to find the posterior marginal for X_i. If φ is initially an MTE potential, ϕ is an MTE potential, because the first message results in an MTE potential according to Theorem 5 and the second message results in an MTE potential because the class of MTE potentials is closed under marginalization.

Fig. 6. The Bayesian network for Example 5.

Fig. 7. The join tree for Example 5.

The next example utilizes the MTE approximation to the normal PDF for join tree operations with a deterministic variable, in order to compare answers with Hugin software.

Example 5. Consider the Bayesian network depicted in Figure 6. Suppose X ∼ N(0, 1), Y ∼ N(1, 1), and Z is a conditionally deterministic function of its parents, Z | x, y ∼ N(2 + x − y, 0). We can calculate the marginal distribution of Z by passing messages in the join tree shown in Figure 7.

The PDFs for X and Y, denoted by f_X and f_Y, respectively, are combined to form the joint PDF for {X, Y} and sent to {X, Y, Z} in the join tree. We next calculate the PDF for Z = X − Y + 2 using Proposition 4 as follows:

    f_Z(z) = ∫ f_{X,Y}(x, x − z + 2) dx.

Note that in this case, since X and Y are independent, f_{X,Y}(x, y) = f_X(x) f_Y(y). Thus, if f_X and f_Y are maintained as decomposed potentials in the message from {X, Y} to {X, Y, Z}, the calculation above can be simplified to

    f_Z(z) = ∫ f_X(x) f_Y(x − z + 2) dx.
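As an illustration (not from the paper), the following sketch evaluates this simplified convolution with the exact normal densities in place of their MTE approximations and confirms that the resulting marginal for Z has mean 1 and variance 2, the values implied by Z = 2 + X − Y.

```python
import numpy as np
from scipy.stats import norm

# Priors of Example 5: X ~ N(0,1), Y ~ N(1,1); Z = 2 + X - Y is deterministic.
x = np.linspace(-8.0, 8.0, 2001)

def f_Z(z):
    # f_Z(z) = integral of f_X(x) * f_Y(x - z + 2) dx (convolution from Proposition 4)
    return np.trapz(norm.pdf(x, 0.0, 1.0) * norm.pdf(x - z + 2.0, 1.0, 1.0), x)

z = np.linspace(-6.0, 8.0, 1401)
fz = np.array([f_Z(v) for v in z])

mean = np.trapz(z * fz, z)
var = np.trapz((z - mean) ** 2 * fz, z)
print(mean, var)   # approximately 1 and 2, matching the exact N(1, 2) marginal
```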

Fig. 8. The marginal PDF for Z in Example 5.

The marginal PDF for Z (shown in Figure 8) was created by approximating the normal PDFs in this example with the MTE approximation to the normal PDF presented in Cobb and Shenoy [1]. The expected value and variance of this marginal PDF are comparable with the exact results obtained using Hugin software, which gives an expected value of 1 and a variance of 2.

Suppose we obtain evidence that Z = 3 and pass this evidence as a message from {Z} to {X, Y, Z}. Since the existing potential for Z states that Z = 2 + X − Y, the evidence dictates the new deterministic relationship X = Y + 1, which is expressed as the CMF p_{X|y} and sent from {X, Y, Z} to {X, Y} in the join tree. The variables X and Y are no longer independent and now have a linear conditionally deterministic relationship. The revised Bayesian network is depicted in Figure 9.

To calculate the revised marginal distribution for X, we combine f_X(x) with the distribution created by applying the linear CDF marginalization operator in (6) to the prior distribution for Y (the latter is the message from {X, Y} to {X}) as follows:

    f_{X_ev}(x) = K · f_X(x) · (f_Y ⊗ p_{X|y})^{−Y}(x) = K · f_X(x) · f_Y(x − 1).

In this calculation, K is a normalization constant. The expected value and variance of the posterior marginal PDF for X are comparable with the exact results obtained using Hugin software, which gives an expected value of 1 and a variance of 0.5.
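A numerical sketch of this evidence step (added for illustration; it uses the exact normal densities rather than MTE approximations) combines the prior for X with the message f_Y(x − 1) and normalizes, recovering the posterior mean 1 and variance 0.5.

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-6.0, 8.0, 2801)

# Unnormalized posterior for X after observing Z = 3 (so X = Y + 1):
# prior f_X(x) times the message f_Y(x - 1) from the linear CDF marginalization operator.
post = norm.pdf(x, 0.0, 1.0) * norm.pdf(x - 1.0, 1.0, 1.0)

K = 1.0 / np.trapz(post, x)        # normalization constant
post *= K

mean = np.trapz(x * post, x)
var = np.trapz((x - mean) ** 2 * post, x)
print(mean, var)                   # approximately 1 and 0.5
```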

Fig. 9. The revised Bayesian network for Example 5 after observing evidence on Z.

To calculate the revised marginal distribution for Y, we combine the prior distribution for Y with the distribution created by applying the linear CDF marginalization operator in (6) to the prior distribution for X (the latter is the message from {X, Y} to {Y}) as follows:

    f_{Y_ev}(y) = K · f_Y(y) · (f_X ⊗ p_{Y|x})^{−X}(y) = K · f_Y(y) · f_X(y + 1).

In this calculation, K is a normalization constant. Propositions 1 and 2 allow us to use either p_{X|y} or p_{Y|x} as equivalent expressions of the deterministic relationship between Y and X. The expected value and variance of the posterior marginal PDF for Y are comparable with the exact results obtained using Hugin software, which gives an expected value of 0 and a variance of 0.5.

5 Example

The Bayesian network in this example (shown in Figure 10) contains one variable (A) which follows a beta distribution, one variable (C) with a Gaussian potential, and one variable (B) which is a linear conditionally deterministic function of its parent. All probability potentials are approximated in the calculations by MTE potentials.

5.1 Representation

The probability distribution for A is a beta distribution with parameters α = 2.7 and β = 1.3, i.e., A ∼ Beta(2.7, 1.3).

Fig. 10. The Bayesian network for the example problem.

The PDF for A is approximated (using the methods described in [4]) by a three-piece MTE potential α(a) = P(A), with each piece of the form in equation (2) defined on one of the intervals (0, d), [d, m), and [m, 1), and with α(a) = 0 elsewhere, where

    m = (1 − α)/(2 − α − β) = 0.85

and

    d = [(α − 1)(α + β − 3) − √((β − 1)(α − 1)(α + β − 3))] / [(α + β − 3)(α + β − 2)] ≈ 0.49.

The MTE potential for A is shown graphically in Figure 11, overlaid on the actual Beta(2.7, 1.3) distribution.

The probability distribution for B is defined as B | a ∼ N(2a + 1, 0). The conditional distribution for B is represented by a CMF as follows:

    β(a, b) = p_{B|a}(a, b) = 1{b = 2a + 1}(a, b).

The probability distribution for C is defined as C | b ∼ N(2b + 1, 1). This distribution is modeled with the MTE approximation to the normal PDF (denoted by δ) from [1].

Fig. 11. The MTE potential for A overlaid on the actual Beta(2.7, 1.3) distribution.

Fig. 12. The join tree for the example problem.

5.2 Computing Messages

The join tree for the example problem is shown in Figure 12. The messages required to calculate prior marginals for each variable in the network without evidence are as follows:

1) α from {A} to {A, B};
2) (α ⊗ β)^{−A} from {A, B} to {B} and from {B} to {B, C};
3) ((α ⊗ β)^{−A} ⊗ δ)^{−B} from {B, C} to {C}.

5.3 Prior Marginals

The prior marginal distribution for B is the message (α ⊗ β)^{−A} sent from {A, B} to {B}, and the prior marginal distribution for C is the message sent from {B, C} to {C}. The expected value and variance of each distribution are calculated from the corresponding marginal MTE potential. The prior marginal distributions for B and C are shown graphically in Figure 13.
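For illustration (not part of the paper), the sketch below computes these two prior marginals with the exact Beta(2.7, 1.3) density in place of its MTE approximation; the moments it reports are the exact values that the MTE-based messages approximate.

```python
import numpy as np
from scipy.stats import beta, norm

# Prior marginal of B = 2A + 1 via the linear CDF marginalization operator (6):
# f_B(b) = (1/2) f_A((b - 1)/2) on (1, 3).
b = np.linspace(1.0 + 1e-6, 3.0 - 1e-6, 4001)
f_B = 0.5 * beta.pdf((b - 1.0) / 2.0, 2.7, 1.3)

EB = np.trapz(b * f_B, b)
VB = np.trapz((b - EB) ** 2 * f_B, b)

# Prior marginal of C, where C | b ~ N(2b + 1, 1): f_C(c) = integral f_B(b) N(c; 2b+1, 1) db.
c = np.linspace(-2.0, 14.0, 801)
f_C = np.array([np.trapz(f_B * norm.pdf(v, 2.0 * b + 1.0, 1.0), b) for v in c])

EC = np.trapz(c * f_C, c)
VC = np.trapz((c - EC) ** 2 * f_C, c)
print(EB, VB)   # about 2.35 and 0.18
print(EC, VC)   # about 5.70 and 1.70
```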

Fig. 13. The prior marginal distributions for B (left) and C (right).

Fig. 14. The posterior marginal distribution for B considering the evidence C = 6.

5.4 Entering Evidence

Assume evidence exists that C = 6 and define e_C = 6. Define η = (α ⊗ β)^{−A} and ϑ(a, b) = p_{A|b}(a, b) = 1{a = 0.5b − 0.5}(a, b) as the potentials resulting from the reversal of the arc between A and B. The evidence e_C = 6 is passed from {C} to {B, C} in the join tree, where the existing potential is restricted to δ(b, 6). This likelihood potential is passed from {B, C} to {B} in the join tree. Denote the unnormalized posterior marginal distribution for B as

    ξ'(b) = η(b) · δ(b, 6).

The normalization constant is calculated as K = ∫ η(b) · δ(b, 6) db. Thus, the normalized marginal distribution for B is found as ξ(b) = K^{−1} ξ'(b). The expected value and variance of this distribution (which is displayed in Figure 14) are calculated from the normalized MTE potential.
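A numerical sketch of this step (illustrative, again using the exact Beta density and normal likelihood instead of their MTE approximations): the prior marginal η for B is multiplied by the likelihood δ(b, 6) = N(6; 2b + 1, 1), normalized, and then mapped back to A through the arc-reversal relationship a = 0.5b − 0.5.

```python
import numpy as np
from scipy.stats import beta, norm

b = np.linspace(1.0 + 1e-6, 3.0 - 1e-6, 4001)

eta = 0.5 * beta.pdf((b - 1.0) / 2.0, 2.7, 1.3)   # prior marginal of B = 2A + 1
lik = norm.pdf(6.0, 2.0 * b + 1.0, 1.0)           # delta(b, 6): likelihood of C = 6 given b

xi_un = eta * lik                                  # unnormalized posterior for B
K = np.trapz(xi_un, b)
xi = xi_un / K                                     # normalized posterior for B

EB = np.trapz(b * xi, b)
VB = np.trapz((b - EB) ** 2 * xi, b)
print("posterior B:", EB, VB)

# Posterior for A via the arc reversal A = 0.5*B - 0.5: theta(a) = 2 * xi(2a + 1).
a = 0.5 * b - 0.5
theta = 2.0 * xi
EA = np.trapz(a * theta, a)
VA = np.trapz((a - EA) ** 2 * theta, a)
print("posterior A:", EA, VA)
```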

Fig. 15. The posterior marginal distribution for A considering the evidence c = 6.

Using the results of Proposition 1, we determine the posterior marginal distribution for A. Define θ = (ξ ⊗ ν)^{−B} as

    θ(a) = (1/0.5) · ξ(2a + 1).

The CMF ν(a, b) = p_{B|a}(a, b) = 1{b = 2a + 1}(a, b) is obtained by reversing the arc between A and B in Figure 10. The expected value and variance of this distribution are calculated from the resulting MTE potential. The posterior marginal distribution for A considering the evidence is shown graphically in Figure 15.

6 Summary and Conclusions

This paper has described operations required for inference in continuous Bayesian networks containing variables that are linear conditionally deterministic functions of their parents. Since the joint PDF for a network with deterministic variables does not exist, the operations presented are derived from the method of convolutions in probability theory. Similar operations to those presented in this paper are incorporated in an inference algorithm for hybrid Bayesian networks (containing discrete and continuous variables) in [3]. This algorithm requires a mixed potential representation to accommodate mixed distributions. Nonlinear deterministic relationships can be accommodated in continuous Bayesian networks by extending the operations in this paper to piecewise linear functions which approximate nonlinear functions, as shown in [2].

Acknowledgments

This research was completed while the first author was a doctoral student at the University of Kansas and was supported by a research assistantship from the School of Business. We thank Thomas D. Nielsen, Antonio Salmerón, participants of the PGM 2004 conference, and two anonymous referees for valuable comments, suggestions, and discussions.

References

[1] B.R. Cobb, P.P. Shenoy, Inference in hybrid Bayesian networks with mixtures of truncated exponentials, Int. J. Approx. Reason. (2005), in press.

[2] B.R. Cobb, P.P. Shenoy, Nonlinear deterministic relationships in Bayesian networks, in: L. Godo (Ed.), Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 3571, Springer-Verlag, Heidelberg, 2005.

[3] B.R. Cobb, P.P. Shenoy, Hybrid Bayesian networks with linear deterministic variables, in: F. Bacchus, T. Jaakkola (Eds.), Uncertainty in Artificial Intelligence, Vol. 21, AUAI Press, Corvallis, OR, 2005.

[4] B.R. Cobb, R. Rumí, P.P. Shenoy, Approximating probability density functions with mixtures of truncated exponentials, in: Proceedings of the Tenth Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU-04), 2004. Available for download at: pshenoy/papers/ipmu04.pdf

[5] R.G. Cowell, A.P. Dawid, S.L. Lauritzen, D.J. Spiegelhalter, Probabilistic Networks and Expert Systems, Springer, New York.

[6] R.J. Larsen, M.L. Marx, An Introduction to Mathematical Statistics and its Applications, Prentice Hall, Upper Saddle River, N.J.

[7] S.L. Lauritzen, Propagation of probabilities, means and variances in mixed graphical association models, J. Amer. Stat. Assoc. 87 (1992).

[8] S.L. Lauritzen, F. Jensen, Stable local computation with conditional Gaussian distributions, Stat. Comput. 11 (2001).

[9] S. Moral, R. Rumí, A. Salmerón, Mixtures of truncated exponentials in hybrid Bayesian networks, in: P. Besnard, S. Benferhat (Eds.), Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 2143, Springer-Verlag, Heidelberg, 2001.

[10] S. Moral, R. Rumí, A. Salmerón, Estimating mixtures of truncated exponentials from data, in: J.A. Gámez, A. Salmerón (Eds.), Proceedings of the First European Workshop on Probabilistic Graphical Models (PGM 02), Cuenca, Spain, 2002.

[11] S. Moral, R. Rumí, A. Salmerón, Approximating conditional MTE distributions by means of mixed trees, in: T.D. Nielsen, N.L. Zhang (Eds.), Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Lecture Notes in Artificial Intelligence, Vol. 2711, Springer-Verlag, Heidelberg, 2003.

[12] R. Rumí, Modelos de Redes Bayesianas con Variables Discretas y Continuas, Doctoral Thesis, Universidad de Almería, Departamento de Estadística y Matemática Aplicada, Almería, Spain.


More information

Discrete Bayesian Networks: The Exact Posterior Marginal Distributions

Discrete Bayesian Networks: The Exact Posterior Marginal Distributions arxiv:1411.6300v1 [cs.ai] 23 Nov 2014 Discrete Bayesian Networks: The Exact Posterior Marginal Distributions Do Le (Paul) Minh Department of ISDS, California State University, Fullerton CA 92831, USA dminh@fullerton.edu

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College

More information

Exact distribution theory for belief net responses

Exact distribution theory for belief net responses Exact distribution theory for belief net responses Peter M. Hooper Department of Mathematical and Statistical Sciences University of Alberta Edmonton, Canada, T6G 2G1 hooper@stat.ualberta.ca May 2, 2008

More information

Learning Mixtures of Truncated Basis Functions from Data

Learning Mixtures of Truncated Basis Functions from Data Learning Mixtures of Truncated Basis Functions from Data Helge Langseth Department of Computer and Information Science The Norwegian University of Science and Technology Trondheim (Norway) helgel@idi.ntnu.no

More information

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS Lecture 16, 6/1/2005 University of Washington, Department of Electrical Engineering Spring 2005 Instructor: Professor Jeff A. Bilmes Uncertainty & Bayesian Networks

More information

Naïve Bayes classification

Naïve Bayes classification Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss

More information

Probabilistic Graphical Models. Guest Lecture by Narges Razavian Machine Learning Class April

Probabilistic Graphical Models. Guest Lecture by Narges Razavian Machine Learning Class April Probabilistic Graphical Models Guest Lecture by Narges Razavian Machine Learning Class April 14 2017 Today What is probabilistic graphical model and why it is useful? Bayesian Networks Basic Inference

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Franz Pernkopf, Robert Peharz, Sebastian Tschiatschek Graz University of Technology, Laboratory of Signal Processing and Speech Communication Inffeldgasse

More information

CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA

CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA CUADERNOS DE TRABAJO ESCUELA UNIVERSITARIA DE ESTADÍSTICA Inaccurate parameters in Gaussian Bayesian networks Miguel A. Gómez-Villegas Paloma Main Rosario Susi Cuaderno de Trabajo número 02/2008 Los Cuadernos

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

Continuous Random Variables and Continuous Distributions

Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Expectation & Variance of Continuous Random Variables ( 5.2) The Uniform Random Variable

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

Uniform Correlation Mixture of Bivariate Normal Distributions and. Hypercubically-contoured Densities That Are Marginally Normal

Uniform Correlation Mixture of Bivariate Normal Distributions and. Hypercubically-contoured Densities That Are Marginally Normal Uniform Correlation Mixture of Bivariate Normal Distributions and Hypercubically-contoured Densities That Are Marginally Normal Kai Zhang Department of Statistics and Operations Research University of

More information

Bayes spaces: use of improper priors and distances between densities

Bayes spaces: use of improper priors and distances between densities Bayes spaces: use of improper priors and distances between densities J. J. Egozcue 1, V. Pawlowsky-Glahn 2, R. Tolosana-Delgado 1, M. I. Ortego 1 and G. van den Boogaart 3 1 Universidad Politécnica de

More information