Sensitivity Analysis in Bayesian Networks: From Single to Multiple Parameters

Hei Chan and Adnan Darwiche
Computer Science Department
University of California, Los Angeles
Los Angeles, CA 90095
{hei,darwiche}@cs.ucla.edu

Abstract

Previous work on sensitivity analysis in Bayesian networks has focused on single parameters, where the goal is to understand the sensitivity of queries to single parameter changes, and to identify single parameter changes that would enforce a certain query constraint. In this paper, we expand the work to multiple parameters which may be in the CPT of a single variable, or the CPTs of multiple variables. Not only do we identify the solution space of multiple parameter changes that would be needed to enforce a query constraint, but we also show how to find the optimal solution, that is, the one which disturbs the current probability distribution the least (with respect to a specific measure of disturbance). We characterize the computational complexity of our new techniques and discuss their applications to developing and debugging Bayesian networks, and to the problem of reasoning about the value (reliability) of new information.

1 Introduction

Sensitivity analysis in Bayesian networks [13, 9] is broadly concerned with understanding the relationship between local network parameters and global conclusions drawn based on the network [12, 2, 8, 11]. This understanding can be useful in a number of areas, including model debugging and system design. In model debugging, the user may wish to identify parameters that are relevant to certain queries, or to identify parameter changes that would be necessary to enforce certain sanity checks on the values of probabilistic queries. In system design, sensitivity analysis can be used to choose false positive and false negative rates for sensors and tests to ensure the quality of an information system based on Bayesian networks.

One technical formalization of sensitivity analysis is as follows.
Given a Bayesian network, and a subset of network parameters, we would like to identify possible changes to these parameters that can ensure the satisfaction of a query constraint, such as $Pr(z \mid e) \geq p$, for some event $z$ and evidence $e$. Other possible query constraints include $Pr(z_1 \mid e)/Pr(z_2 \mid e) \geq k$ and $Pr(z_1 \mid e) - Pr(z_2 \mid e) \geq k$ for events $z_1$ and $z_2$.

Figure 1 depicts an example of a sensitivity analysis session using SamIam [1]. Here, the network portrays an information system for predicting pregnancy based on the results of three tests. The current evidence indicates that the blood test is positive while the urine test is negative, and the probability of pregnancy given the test results is 90%. Suppose, however, that we wish the test results to confirm pregnancy to no less than 95%. Sensitivity analysis can be used in this case to identify necessary parameter changes to enforce this constraint, which can translate to changes in the false positive and false negative rates of various tests (which can be implemented by obtaining more reliable tests).

A key aspect of sensitivity analysis is the number of considered parameters. The simplest case involves one parameter at a time, i.e., we are only allowed to change a single parameter in the network to ensure our query constraint. Previous work has provided a procedure to find these single-parameter changes [4], using the fact that any joint probability is a linear function of any network parameter. Specifically, given a parameter $\theta_{x|u}$, we can solve for the possible values of $\Delta\theta_{x|u}$ that can ensure a given constraint. The time complexity needed to identify such changes for all network parameters is the same as performing inference using classical algorithms such as jointree. In our previous example, we can solve for the solution for each parameter using SamIam, where the parameter changes are displayed in Figure 1. For example, one of the suggestions is to change the false positive of the blood test from 10% to no more than 5%. Obviously, there are many parameters that are irrelevant to the query in this case.

Figure 1: Finding parameter changes using the sensitivity analysis tool in SamIam.

Single parameter changes are easy to visualize and compute, but they are only a subset of possible parameter changes. We may generally be interested in changing multiple parameters in the network simultaneously to ensure the query constraint. To facilitate this, we need to understand the interaction between any joint probability and any set of network parameters [5, 6].

One common case involves changing multiple parameters but within the same conditional probability table (CPT) of some variable. The first contribution of this paper is that of showing how to identify such changes, with little extra computation beyond that needed for single parameter changes. This is significant since multiple parameter changes can be more meaningful, and may disturb the probability distribution less significantly than single parameter changes. Practically speaking, this new technique allows us to change both the false positive and false negative rates of a certain information source, which can allow the enforcement of certain constraints that cannot be enforced by only changing either the false positive or the false negative rate.

Our second contribution involves techniques for finding parameter changes that involve multiple CPTs. However, as we will show, the complexity increases linearly in the size of each additional CPT that is involved. Therefore, practically, we can only compute suggestions of parameter changes involving a small subset of CPTs.

As expected, the solution space for multiple parameters will be a region in the $k$-dimensional space, where $k$ is the number of involved parameters. For example, for the case where we change two parameters in the same CPT, the solution space will be a half plane, in the form of $\alpha_1 \Delta\theta_1 + \alpha_2 \Delta\theta_2 \geq c$. These results are difficult to visualize and present to users.
Hence, we may want to identify and report a particular point in the solution space, i.e., a specific amount of change in $\theta_1$ and $\theta_2$. Now, the key question becomes: Which point in the solution space should we report? The approach we shall adopt is to report the point which minimizes model disturbance. But this brings another question: How to measure and quantify model disturbance? To address this question, we will quantify the disturbance to a model by measuring the distance between the original distribution $pr$ and the new one $Pr$ (after the parameters have been changed) using a specific distance measure [3], for reasons we will discuss later.

A third contribution in this paper relates to the application of our results to the problem of reasoning about uncertain evidence. Specifically, we show how our results allow us to identify the weakest uncertain evidence, and on what network variables, that is needed to confirm a given hypothesis to some degree.

2 Sensitivity Analysis: Single CPT

We will present solutions to two key problems in this section. First, given a Bayesian network that specifies a distribution $pr$, and a variable $X$ with parents $U$, we want to identify all changes to parameters $\theta_{x|u}$ in the CPT of $X$ which would enforce the constraint $Pr(z \mid e) \geq p$. Here, $Z$ and $E$ are arbitrary variables in the network, $pr$ is the distribution before we apply parameter changes, and $Pr$ is the distribution after the change. Second, among the identified changes, we want to select those that minimize the distance between the old distribution $pr$ and the new one $Pr$ according to the following distance measure [3]:

$$D(Pr, pr) \stackrel{def}{=} \ln \max_{\omega} \frac{pr(\omega)}{Pr(\omega)} - \ln \min_{\omega} \frac{pr(\omega)}{Pr(\omega)}. \qquad (1)$$

This measure allows one to bound the amount of change in the value of any query $\beta_1 \mid \beta_2$, from $pr$ to $Pr$, as follows:

$$\frac{pr(\beta_1 \mid \beta_2)\, e^{-d}}{pr(\beta_1 \mid \beta_2)(e^{-d} - 1) + 1} \;\leq\; Pr(\beta_1 \mid \beta_2) \;\leq\; \frac{pr(\beta_1 \mid \beta_2)\, e^{d}}{pr(\beta_1 \mid \beta_2)(e^{d} - 1) + 1}, \qquad (2)$$

where $d = D(Pr, pr)$. Hence, by minimizing this distance measure, we are able to provide tighter bounds on global belief changes caused by local parameter changes.

One obvious side-effect of changing the parameter $\theta_{x|u}$ is that parameters $\theta_{x'|u}$, for all $x' \neq x$, must also be changed such that the sum of all these parameters remains 1. Therefore, if $X$ is binary, the parameter $\theta_{\bar{x}|u}$ must be changed by an equal but opposite amount. If $X$ is multi-valued, we can assume a proportional scheme of co-varying the other parameters, such that the ratios between them remain the same. However, sometimes certain parameters should remain unchanged, such as parameters that are assigned 0 values [15]. This capability is provided in SamIam, where users can lock certain parameters from being changed during sensitivity analysis by checking a flag. In this paper, for simplicity of presentation, we will assume that $X$ is binary with two values $x$ and $\bar{x}$, where we can obtain similar (but more wordy) results for $X$ being multi-valued [4].
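As a minimal sketch of Equations 1 and 2, assuming both distributions are available as explicit lists of world probabilities with the same support, the distance and the resulting query bounds can be computed directly. The function names and the two-variable joint below are illustrative assumptions; they are not taken from SamIam or from the examples in this paper.

```python
import math

def cd_distance(pr_old, pr_new):
    """Equation 1: ln max_w ratio(w) - ln min_w ratio(w), with ratio = old/new."""
    ratios = [old / new for old, new in zip(pr_old, pr_new)]
    return math.log(max(ratios)) - math.log(min(ratios))

def query_bounds(p, d):
    """Inequality 2: range of the new value of a query whose old value is p."""
    lower = p * math.exp(-d) / (p * (math.exp(-d) - 1) + 1)
    upper = p * math.exp(d) / (p * (math.exp(d) - 1) + 1)
    return lower, upper

# Hypothetical joint over binary A and B, worlds ordered (a,b), (a,~b), (~a,b), (~a,~b).
# Pr(a) = .3 throughout; the CPT of B changes from (.9, .2) to (.95, .1).
pr_old = [0.3 * 0.90, 0.3 * 0.10, 0.7 * 0.2, 0.7 * 0.8]
pr_new = [0.3 * 0.95, 0.3 * 0.05, 0.7 * 0.1, 0.7 * 0.9]

d = cd_distance(pr_old, pr_new)

# The query Pr(a | b) before and after the change.
old_query = pr_old[0] / (pr_old[0] + pr_old[2])
new_query = pr_new[0] / (pr_new[0] + pr_new[2])
lower, upper = query_bounds(old_query, d)
print(f"d = {d:.4f}, bounds = [{lower:.4f}, {upper:.4f}], new query = {new_query:.4f}")
```

Because the measure uses the spread between the largest and smallest ratio, swapping the roles of the old and new distribution leaves the distance unchanged.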
2.1 Identifying sufficient parameter changes

We first note that the joint probability $Pr(e)$ can be expressed in terms of the parameters in the CPT of $X$:

$$Pr(e) = C + \sum_{u} C_u\, \theta_{x|u},$$

where $C$ is a constant, and:

$$C_u = \frac{\partial Pr(e)}{\partial \theta_{x|u}}.$$

As a reminder, $Pr(e)$ is linear in $\theta_{x|u}$, and hence $\partial Pr(e)/\partial \theta_{x|u}$ is a constant, independent of the value of $\theta_{x|u}$. Moreover, $\partial^2 Pr(e)/\partial \theta_{x|u}\, \partial \theta_{x|u'} = 0$ for any parent instantiations $u$ and $u'$ [6]. Therefore, if we apply a change of $\Delta\theta_{x|u}$ to each $\theta_{x|u}$, we have:

$$\Delta Pr(e) = Pr(e) - pr(e) = \sum_{u} C_u\, \Delta\theta_{x|u} = \sum_{u} \frac{\partial Pr(e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u}. \qquad (3)$$

Now, to find the solution of parameter changes that satisfies $Pr(z \mid e) \geq p$ (or equivalently, $Pr(z, e) \geq p \cdot Pr(e)$), the following must hold:

$$\Delta Pr(z, e) + pr(z, e) \geq p\,(\Delta Pr(e) + pr(e)).$$

From Equation 3, we have:

$$\sum_{u} \frac{\partial Pr(z, e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u} + pr(z, e) \;\geq\; p \left( \sum_{u} \frac{\partial Pr(e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u} + pr(e) \right).$$

Rearranging the terms, we get:

$$\sum_{u} \alpha(\theta_{x|u})\, \Delta\theta_{x|u} \;\geq\; -\left(pr(z, e) - p \cdot pr(e)\right), \qquad (4)$$

where:

$$\alpha(\theta_{x|u}) = \frac{\partial Pr(z, e)}{\partial \theta_{x|u}} - p\, \frac{\partial Pr(e)}{\partial \theta_{x|u}}. \qquad (5)$$

The first problem addressed in this section can then be solved by finding possible combinations of $\Delta\theta_{x|u}$ that satisfy Inequality 4. The solution space can be found by solving for the equality condition, and it will be in the shape of a half space due to the linearity of our terms.

To find the solution space of Inequality 4, we need to compute all partial derivatives of the form $\partial Pr(z, e)/\partial \theta_{x|u}$ and $\partial Pr(e)/\partial \theta_{x|u}$. They can be computed using the jointree algorithm [14, 10] or the differential approach [6]. The time complexity of this computation is $O(n \exp(w))$, where $w$ is the network treewidth, and $n$ is the number of network parameters [6]. This complexity is the same as that of computing the probability of evidence $Pr(e)$. Moreover, the above method generalizes a previous method [4] for identifying single parameter changes, yet has the same complexity!
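To make Inequality 4 concrete, the following sketch works it through on a hypothetical two-node network $Z \rightarrow X$ with evidence $X = positive$. The network, its numbers, and all identifiers are assumptions chosen so that the derivatives can be written by hand; a real tool would obtain them from jointree or differential inference. Since only the parameters $\theta_{positive|u}$ appear in $Pr(e)$, the complementary parameters are left implicit.

```python
# Hypothetical network Z -> X, evidence e: X = positive, query Pr(z | e) >= p.
theta_z = 0.5                             # prior Pr(z)
theta_pos = {"z": 0.8, "not_z": 0.2}      # Pr(X = positive | Z), the CPT being changed
p = 0.9                                   # required lower bound on Pr(z | e)

# Joint probabilities are linear in the CPT parameters of X:
#   Pr(z, e) = theta_z * theta_pos["z"]
#   Pr(e)    = theta_z * theta_pos["z"] + (1 - theta_z) * theta_pos["not_z"]
d_pze = {"z": theta_z, "not_z": 0.0}           # partial derivatives of Pr(z, e)
d_pe = {"z": theta_z, "not_z": 1 - theta_z}    # partial derivatives of Pr(e)

pr_ze = theta_z * theta_pos["z"]
pr_e = pr_ze + (1 - theta_z) * theta_pos["not_z"]

alpha = {u: d_pze[u] - p * d_pe[u] for u in theta_pos}   # Equation 5
rhs = -(pr_ze - p * pr_e)                                # right side of Inequality 4

def satisfies(delta):
    """Check Inequality 4 for a dict of parameter changes (small float tolerance)."""
    return sum(alpha[u] * delta[u] for u in alpha) >= rhs - 1e-12

def posterior(delta):
    """Recompute Pr(z | e) exactly after applying the changes."""
    new = {u: theta_pos[u] + delta[u] for u in theta_pos}
    ze = theta_z * new["z"]
    return ze / (ze + (1 - theta_z) * new["not_z"])

# Smallest single-parameter change (in absolute value) that meets the constraint.
single = {u: rhs / alpha[u] for u in alpha}
joint_change = {"z": 0.1, "not_z": -0.1}
print(single, satisfies(joint_change), posterior(joint_change))
```

In this toy case the single change to $\theta_{positive|z}$ would push the parameter above 1 and is therefore infeasible, while a combined change to both parameters reaches the boundary $Pr(z \mid e) = p$ exactly.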

Figure 2: Finding single CPT changes for Example 2.1. On the left, we plot the solution space in terms of $\Delta\theta_{fp}$ and $\Delta\theta_{fn}$, which is the region below the line. On the right, we illustrate how we find the optimal solution, by moving on the curve where the log-odds changes in the two parameters are the same, as the optimal solution is the intersection of the line and the curve.

Example 2.1 Given the sensitivity analysis problem shown in Figure 1, we are now interested in changing multiple parameters in a single CPT to satisfy the constraint. For example, we may want to use a more reliable blood test to satisfy our desired constraint. Currently, the false positive of the test is 10%, while the false negative is 30%. We will denote these two parameters as $\theta_{fp}$ and $\theta_{fn}$ respectively. We can find the $\alpha$ terms for both parameters given by Equation 5, and plug into Inequality 4:

$$-.1061\, \Delta\theta_{fp} - .0076\, \Delta\theta_{fn} \geq .0053.$$

The solution space is plotted on the left of Figure 2. The line indicates the set of points where the equality condition holds, while the solution space is the region below the line. Therefore, any parameter changes in this region will be able to ensure the constraint that the probability of pregnancy given the test results is at least 95%.

2.2 Identifying optimal parameter changes

We now address the second problem of interest in this section: identifying the solution of Inequality 4 that minimizes the distance between the original and new distribution. This solution is unique and can be identified using a simple local search procedure. But the technique is based on several observations.

First, since the new distribution $Pr$ is obtained from $pr$ by changing only one CPT, the distance between $pr$ and $Pr$ is exactly the distance between the old and new CPT (each viewed as a distribution) [3].

Second, there is a closed form for this distance (see footnote 1):

$$D_{X|U} = \max_{u} \left| \log \frac{\theta_{x|u} + \Delta\theta_{x|u}}{1 - (\theta_{x|u} + \Delta\theta_{x|u})} - \log \frac{\theta_{x|u}}{1 - \theta_{x|u}} \right|. \qquad (6)$$

Note that the quantity being maximized above is nothing but the absolute log odds change for parameter $\theta_{x|u}$, i.e., $|\Delta(\log O(\theta_{x|u}))|$.

Third, we must be able to find an optimal solution on the line where $Pr(z \mid e) = p$, since if there is an optimal solution where $Pr(z \mid e) > p$, we can always decrease the absolute log odds change in some parameter to satisfy the equality condition, and the distance measure will not increase.

Finally, for any solution that satisfies $Pr(z \mid e) = p$, it follows from Equation 6 that the solution which minimizes the distance $D_{X|U}$ is the one where the absolute log odds changes in all parameters in the CPT are the same. This is because to obtain another solution on the line, we must increase the absolute log odds change in one parameter and decrease it in another, thereby producing a larger distance measure.

Given the above observations, we can now search for the optimal single CPT parameter change that satisfies Inequality 4 using the following local search procedure:

1. Pick all parameters $\theta_{x|u}$ in the CPT of $X$ where the terms $\alpha(\theta_{x|u})$ are non-zero, and categorize them according to whether the term is positive or negative.
2. Pick a certain amount of absolute log odds change $|\Delta(\log O(\theta_{x|u}))|$, and apply it to each parameter. Whether a parameter is increased or decreased depends on its $\alpha$ term.
3. If $Pr(z \mid e) = p$ within an acceptable degree of error, we have found the optimal solution. Otherwise, try a larger $|\Delta(\log O(\theta_{x|u}))|$ if $Pr(z \mid e) < p$, or a smaller $|\Delta(\log O(\theta_{x|u}))|$ if $Pr(z \mid e) > p$. The new amount of $|\Delta(\log O(\theta_{x|u}))|$ applied should be determined numerically by the new query value of $Pr(z \mid e)$ for a fast rate of convergence.

On the right of Figure 2, we provide an illustration of our procedure applied to Example 2.1. There are two parameters in the CPT we are allowed to change, the false positive and the false negative rates. The region below the line is the solution space.
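For the two parameters of Example 2.1, the three-step procedure can be sketched as a plain bisection on the common absolute log-odds change. This is an illustration rather than SamIam's implementation: it swaps the adaptive step selection described in step 3 for bisection, and it evaluates the constraint through the linear form of Inequality 4 with the rounded $\alpha$ terms printed in the example instead of re-running inference. All identifiers are hypothetical.

```python
import math

def logit(t):
    return math.log(t / (1 - t))

def sigmoid(v):
    return 1 / (1 + math.exp(-v))

# Values from Example 2.1 (rounded as printed in the text).
THETA = {"fp": 0.10, "fn": 0.30}          # current parameter values
ALPHA = {"fp": -0.1061, "fn": -0.0076}    # alpha terms from Equation 5
RHS = 0.0053                              # right side of Inequality 4

# Step 1: keep parameters with non-zero alpha; the sign decides the direction.
ACTIVE = {name: (1 if a > 0 else -1) for name, a in ALPHA.items() if a != 0}

def changes(delta):
    """Step 2: apply the same absolute log-odds change delta to every active parameter."""
    return {name: sigmoid(logit(THETA[name]) + sign * delta) - THETA[name]
            for name, sign in ACTIVE.items()}

def slack(delta):
    """Left side minus right side of Inequality 4; zero on the line Pr(z|e) = p."""
    d = changes(delta)
    return sum(ALPHA[name] * d[name] for name in d) - RHS

# Step 3: bisect on delta until the equality condition holds.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if slack(mid) < 0:
        lo = mid      # constraint not yet met: need a larger change
    else:
        hi = mid      # constraint met: try a smaller change
best = changes(hi)
print(f"log-odds change = {hi:.3f}, d_fp = {best['fp']:.3f}, d_fn = {best['fn']:.3f}")
```

Bisection is safe here because the left side of Inequality 4 grows monotonically with the common log-odds change once every parameter moves in the direction given by the sign of its $\alpha$ term.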
The points on the new curve are those where the log odds changes in the two parameters are the same (both parameters are decreased because their $\alpha$ terms are negative). To find the optimal solution, we only need to move on this curve using a numerical method, until we are at the intersection of the line and the curve. It is given by $\Delta\theta_{fp} = -.042$ and $\Delta\theta_{fn} = -.109$, i.e., the new false positive should be 5.8% and the new false negative should be 19.1%. The above technique has been implemented in SamIam which is available for download [1].

Footnote 1: Equation 6 assumes $Pr(u) > 0$ for all instantiations $u$. If $Pr(u) = 0$ for some $u$, any query value will not respond to any change in the parameter $\theta_{x|u}$, so we can leave it out when computing the distance measure.

3 Single Parameter Changes vs. Single CPT Changes

In this section, we make comparisons between single parameter changes and single CPT changes. As we have shown, both types of suggestions require the same amount of computation to find, namely computing the partial derivatives of joint probabilities with respect to all parameters. Solutions of single CPT changes are, however, harder to visualize and present, and it takes a little more time to find the optimal solution using the numerical method we proposed.

Nevertheless, it is advantageous to apply single CPT changes instead of single parameter changes to a Bayesian network in order to satisfy a query constraint. First, single CPT changes are more meaningful and intuitive than single parameter changes. For example, given a sensor in a network, single parameter changes amount to changing only the false positive or false negative rate of this sensor, while single CPT changes allow one to change both rates.

Second, for some variable in the network, there may exist single CPT changes, but not single parameter changes, that can ensure a certain query constraint. For an example consider Figure 4, which depicts a Bayesian network involving a scenario of potential fire in a building. Currently, we are given evidence of smoke and people leaving the building, and the probability that there is a tampering of the alarm given the evidence is 2.87%. We may now pose the question: what parameter changes can we apply to decrease this value to at most 1%?
If we can only change a single parameter in the network, SamIam returns a simple answer: the only parameter you can change is the prior probability of tampering, from 2% to .7%. You cannot change any single parameter in the CPT of the Alarm variable (representing whether the alarm is triggered, by fire, tampering or other sources) to ensure the constraint, and we may be inclined to believe that the parameters in this CPT are irrelevant to the query. However, if we are allowed to change multiple parameters in a single CPT, SamIam returns a new suggestion, telling us that we can indeed change the CPT of the Alarm variable to ensure our constraint. The optimal suggestion computed by SamIam is shown in Figure 4, where the original parameter values are in white background, and the suggested parameter values are in shaded background. The distance measure of this parameter change is 2.29.

Figure 3: The plot of the bounds on the new value of any query, $q = Pr(\beta_1 \mid \beta_2)$, in terms of its original value $p = pr(\beta_1 \mid \beta_2)$, for the suggested single parameter change, with $d = .995$, and the suggested single CPT change, with $d = .445$.

Finally, even if changes of both types are available, single CPT changes are often preferred because they disturb the network less significantly, as they incur a smaller distance measure. For example, we can pose another query constraint, where we want to decrease the query value from 2.87% to at most 2.5%. This time, for the CPT of the Alarm variable, SamIam returns parameter change suggestions of both types. A possible single parameter change is to decrease the probability of the alarm triggered given tampering but no fire from 85% to 67.7%, incurring a distance measure of .995. On the other hand, if we change all parameters in the CPT simultaneously, the distance measure incurred is a much smaller value of .445.
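The two distance values just quoted can be turned into concrete guarantees by evaluating Inequality 2 at a few original query values. The sketch below only applies that inequality; the sample values of $p$ and the identifier names are arbitrary illustrative choices.

```python
import math

def query_bounds(p, d):
    """Inequality 2: bounds on the new value of a query with original value p."""
    lower = p * math.exp(-d) / (p * (math.exp(-d) - 1) + 1)
    upper = p * math.exp(d) / (p * (math.exp(d) - 1) + 1)
    return lower, upper

D_SINGLE_PARAMETER = 0.995   # distance of the suggested single parameter change
D_SINGLE_CPT = 0.445         # distance of the suggested single CPT change

rows = []
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    rows.append((p, query_bounds(p, D_SINGLE_PARAMETER), query_bounds(p, D_SINGLE_CPT)))

for p, wide, tight in rows:
    print(f"p = {p:.1f}  single parameter: [{wide[0]:.3f}, {wide[1]:.3f}]"
          f"  single CPT: [{tight[0]:.3f}, {tight[1]:.3f}]")
```

For every original value the interval obtained from the smaller distance lies strictly inside the one obtained from the larger distance, which is the sense in which the single CPT change disturbs the distribution less.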
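From Inequality 2, the distance measure computed for a parameter change quantifies the disturbance to the original probability distribution, by providing bounds on changes in any query $\beta_1 \mid \beta_2$. In Figure 3, we plot the bounds on the new value of any query, $q = Pr(\beta_1 \mid \beta_2)$, in terms of its original value, $p = pr(\beta_1 \mid \beta_2)$, for the values of the distance incurred by the suggested single parameter change and the suggested single CPT change respectively. As we can see, the suggested single CPT change ensures a tighter bound on the change in any query value.

4 Sensitivity Analysis: Multiple CPTs

In this section, we allow the changing of parameters in multiple CPTs simultaneously. For example, we may want to change all parameters in the CPTs of variables $X$ and $Y$, whose parents are $U$ and $V$ respectively.

Figure 4: Finding single CPT changes using the sensitivity analysis tool of SamIam.

In this case, the joint probability $Pr(e)$ can be expressed in terms of the parameters in both CPTs as:

$$Pr(e) = C + \sum_{u} C_u\, \theta_{x|u} + \sum_{v} C_v\, \theta_{y|v} + \sum_{u,v} C_{u,v}\, \theta_{x|u}\, \theta_{y|v},$$

where $C$ is a constant, and:

$$\frac{\partial Pr(e)}{\partial \theta_{x|u}} = C_u + \sum_{v} C_{u,v}\, \theta_{y|v};$$

$$\frac{\partial Pr(e)}{\partial \theta_{y|v}} = C_v + \sum_{u} C_{u,v}\, \theta_{x|u};$$

$$\frac{\partial^2 Pr(e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}} = C_{u,v}.$$

Therefore, if we apply a change of $\Delta\theta_{x|u}$ to each $\theta_{x|u}$, and a change of $\Delta\theta_{y|v}$ to each $\theta_{y|v}$, the change in the joint probability $Pr(e)$ is given by:

$$\Delta Pr(e) = \sum_{u} \Big( C_u + \sum_{v} C_{u,v}\, \theta_{y|v} \Big) \Delta\theta_{x|u} + \sum_{v} \Big( C_v + \sum_{u} C_{u,v}\, \theta_{x|u} \Big) \Delta\theta_{y|v} + \sum_{u,v} C_{u,v}\, \Delta\theta_{x|u}\, \Delta\theta_{y|v}$$

$$= \sum_{u} \frac{\partial Pr(e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u} + \sum_{v} \frac{\partial Pr(e)}{\partial \theta_{y|v}}\, \Delta\theta_{y|v} + \sum_{u,v} \frac{\partial^2 Pr(e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}}\, \Delta\theta_{x|u}\, \Delta\theta_{y|v}. \qquad (7)$$

Now, to find the solution of parameter changes that satisfies $Pr(z \mid e) \geq p$, from Equation 7, we have:

$$\sum_{u} \frac{\partial Pr(z, e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u} + \sum_{v} \frac{\partial Pr(z, e)}{\partial \theta_{y|v}}\, \Delta\theta_{y|v} + \sum_{u,v} \frac{\partial^2 Pr(z, e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}}\, \Delta\theta_{x|u}\, \Delta\theta_{y|v} + pr(z, e)$$

$$\geq\; p \left( \sum_{u} \frac{\partial Pr(e)}{\partial \theta_{x|u}}\, \Delta\theta_{x|u} + \sum_{v} \frac{\partial Pr(e)}{\partial \theta_{y|v}}\, \Delta\theta_{y|v} + \sum_{u,v} \frac{\partial^2 Pr(e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}}\, \Delta\theta_{x|u}\, \Delta\theta_{y|v} + pr(e) \right).$$

Rearranging terms, we get:

$$\sum_{u} \alpha(\theta_{x|u})\, \Delta\theta_{x|u} + \sum_{v} \alpha(\theta_{y|v})\, \Delta\theta_{y|v} + \sum_{u,v} \alpha(\theta_{x|u}, \theta_{y|v})\, \Delta\theta_{x|u}\, \Delta\theta_{y|v} \;\geq\; -\left(pr(z, e) - p \cdot pr(e)\right), \qquad (8)$$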

where $\alpha(\theta_{x|u})$ and $\alpha(\theta_{y|v})$ are given by Equation 5, and:

$$\alpha(\theta_{x|u}, \theta_{y|v}) = \frac{\partial^2 Pr(z, e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}} - p\, \frac{\partial^2 Pr(e)}{\partial \theta_{x|u}\, \partial \theta_{y|v}}. \qquad (9)$$

Therefore, additionally we need to compute the second partial derivatives of $Pr(z, e)$ and $Pr(e)$ with respect to $\theta_{x|u}$ and $\theta_{y|v}$ for all pairs of $u$ and $v$. A simple way to do this would be to set evidence on every family instantiation $x, u$, then find the derivatives with respect to $\theta_{y|v}$ for all $v$ [6]. The complexity of this method is $O(n\, F(X) \exp(w))$, where $F(X)$ is the number of family instantiations of $X$, i.e., the size of the CPT. This approach is however limited to non-extreme values of $\theta_{x|u}$, yet it allows one to use any general inference algorithm [6]. For extreme parameters, one can use a specific inference approach [6] to obtain these derivatives using the same complexity as given above.

The computations above can be expanded to multiple parameter changes involving more than two CPTs. For example, if we change three CPTs simultaneously, we need to compute the third partial derivatives with respect to the corresponding parameters. The complexity of obtaining these higher order derivatives is $O(n \prod_i F(X_i) \exp(w))$, where $X_i$ are the variables whose CPTs we are interested in [6].

Example 4.1 We again refer to the fire network, and pose another sensitivity analysis problem. Given evidence that people are leaving but no smoke is observed, the current probability of having a fire is 5.2%. We wish to constrain this query value to at most 2.5%. SamIam indicates that we can accomplish this by decreasing the prior probability of fire, $\theta_F$, from 1% to .47%, or increasing the prior probability of tampering, $\theta_T$, from 2% to 4.39%. However, what are the changes necessary if we are allowed to change both parameters? To answer this, we find the $\alpha$ terms given by Equations 5 and 9, and plug into Inequality 8:

$$-.0845\, \Delta\theta_F + .0187\, \Delta\theta_T - .7816\, \Delta\theta_F\, \Delta\theta_T \geq .000448.$$

The solution space is plotted on the left of Figure 5.
The curve indicates the set of points where the equality condition holds, while the solution space is the region above the curve. Therefore, any parameter changes in this region will be able to ensure the constraint that the probability of fire given the evidence is at most 2.5%.

Figure 5: Finding multiple CPT changes for Example 4.1. On the left, we plot the solution space in terms of $\Delta\theta_F$ and $\Delta\theta_T$, which is the region above the curve. On the right, we illustrate how we find the optimal solution, by computing the distance measure for each point on the curve in terms of $\Delta\theta_F$, and locating the minimum.

We now wish to compute the distance measure for parameter change suggestions involving multiple CPTs, in order to find the optimal solution. Although this cannot be easily computed in some cases, for the cases where the families $X, U$ and $Y, V$ are disjoint, i.e., $X$ and $Y$ do not have a parent-child relationship and do not have a common parent, the distance measure can be easily computed as [3]:

$$D_{\{X|U,\; Y|V\}} = D_{X|U} + D_{Y|V}. \qquad (10)$$

Here, the total distance measure can be computed as the sum of the distances caused individually by each of the CPT changes, as computed by Equation 6 (see footnote 2). Even though we have this restriction of disjointness for Equation 10, many CPTs satisfy this condition. For example, the two variables, Fire and Tampering, involved in Example 4.1, are both roots, and hence, satisfy our condition. Moreover, when the variables involved are sensors on different variables in a Bayesian network, their families are disjoint, and we can easily compute the distance measure using Equation 10.

Similar to single CPT changes, we are often more interested in finding the optimal solution than presenting the whole solution space. As in the previous case, we can find an optimal solution on the curve where $Pr(z \mid e) = p$, and also with the property that the log-odds changes in the parameters of each individual CPT are the same.
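For Example 4.1, where both variables are roots and Equation 10 applies, the search for the least disturbing point on the equality curve can be sketched with a simple grid scan. The coefficients are the rounded values printed in the example, with signs reconstructed so that the quoted single parameter solutions satisfy the inequality; the grid resolution and all identifiers are assumptions made for illustration.

```python
import math

def logit(t):
    return math.log(t / (1 - t))

THETA_F, THETA_T = 0.01, 0.02                  # current priors of Fire and Tampering
A_F, A_T, A_FT = -0.0845, 0.0187, -0.7816      # alpha terms from Equations 5 and 9
RHS = 0.000448                                 # right side of Inequality 8

def d_theta_t_on_curve(d_f):
    """Solve A_F*d_f + A_T*d_t + A_FT*d_f*d_t = RHS for d_t (the equality curve)."""
    return (RHS - A_F * d_f) / (A_T + A_FT * d_f)

def distance(d_f, d_t):
    """Equation 10: sum of the absolute log-odds changes of the two root priors."""
    return (abs(logit(THETA_F + d_f) - logit(THETA_F))
            + abs(logit(THETA_T + d_t) - logit(THETA_T)))

# Single parameter solutions: change only theta_F, or only theta_T.
d_f_only = RHS / A_F
d_t_only = RHS / A_T

# Scan the curve between the two single parameter solutions.
steps = 2000
grid = [d_f_only * i / steps for i in range(steps + 1)]
best_d_f = min(grid, key=lambda d_f: distance(d_f, d_theta_t_on_curve(d_f)))
best_d_t = d_theta_t_on_curve(best_d_f)
best_distance = distance(best_d_f, best_d_t)
print(f"d_F = {best_d_f:.4f}, d_T = {best_d_t:.4f}, distance = {best_distance:.3f}")
```

A grid scan is used here only because the curve is one-dimensional and cheap to evaluate; the distance is quite flat near its minimum, so a coarse grid already locates it to the precision reported in the text.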
With these two assumptions, we can find the combination of CPT changes that gives us the smallest distance measure. For example, we can find the optimal solution for Example 4.1 by traversing the curve where $\Pr(z \mid e) = p$, and searching for the point with the smallest distance measure. On the right of Figure 5, we plot the distance measure for points on that curve in terms of $\Delta\theta_F$. The minimum is attained when $\Delta\theta_F = -0.0039$ and $\Delta\theta_T = 0.0056$, i.e., the new prior probabilities are 0.61% and 2.56% respectively. The distance measure given by the optimal solution is 0.745.

[Footnote 2] If the families X, U and Y, V are not disjoint, the distance measure cannot be computed as the sum of the individual distances, because a pair of instantiations of the two CPTs may not be consistent. In this case, we can still compute the distance measure using a procedure which multiplies two tables (thereby eliminating inconsistent pairs of instantiations), a harder but still manageable process.

Because of the computations involved in finding solutions involving multiple CPTs, the key to any automated sensitivity analysis tool which implements this procedure is to find relevant CPTs to check for solutions, instead of trying all combinations of CPTs, which would be computationally too costly. The first partial derivatives computed for finding single CPT changes can serve as a guide for identifying these relevant CPTs. For many CPTs, the first partial derivatives with respect to the parameters are 0, eliminating them from consideration. On the other hand, we should definitely consider CPTs where small parameter changes can induce large changes in the user-selected queries. Hence, the numerical procedure is not as straightforward as the one for single CPT changes.

5 Searching for the Optimal Soft Evidence

One application of sensitivity analysis with multiple parameters is to find the optimal soft evidence that would enforce a certain constraint. Soft evidence is formally defined as follows. Given two events of interest, q (virtual event) and r (hypothesis), we specify the soft evidence that q bears on r by the likelihood ratio, $\lambda = \Pr(q \mid r)/\Pr(q \mid \bar{r})$ [13, 7]. The virtual event q serves as soft evidence on r, representing a partial confirmation or denial of r. If λ is more than 1, q argues for r, while if λ is less than 1, q argues against r. If λ equals 1, q is trivial and does not shed any new information on r. The likelihood ratio λ also quantifies the strength of the soft evidence, with values closer to infinity or zero indicating more convincing arguments.

From a Bayesian network perspective, the virtual evidence q can be implemented as a dummy node Q which is added as a child of the variable R it is reporting on.
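The role of λ can be made concrete by writing Bayes' rule in odds form. This is a standard identity rather than a formula taken from the paper, and it assumes that $0 < \Pr(r) < 1$ and $\Pr(q) > 0$:

```latex
\frac{\Pr(r \mid q)}{\Pr(\bar{r} \mid q)}
  = \frac{\Pr(q \mid r)}{\Pr(q \mid \bar{r})} \cdot \frac{\Pr(r)}{\Pr(\bar{r})}
  = \lambda \cdot \frac{\Pr(r)}{\Pr(\bar{r})}
```

In log-odds terms the update is additive: observing q shifts the log-odds of r by $\ln \lambda$, which is zero exactly when $\lambda = 1$.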
The likelihood ratio λ will be encoded in the CPT of Q by specifying $\lambda = \theta_{q|r}/\theta_{q|\bar{r}}$. The soft evidence is then incorporated by setting the value of Q to q [13]. For example, in the fire network we showed previously, we can add a smoke detector to the network, which generates a sound when it detects smoke, but is not perfect and is associated with small false positive and false negative rates. The triggering of the detector can be viewed as soft evidence on the presence of smoke, as it argues for the presence of smoke.

Given a number of variables that we can potentially gather soft evidence on, we may be interested in finding the minimum amount of soft evidence to ensure a certain query constraint. To do that, we must first add a child $Q_i$ to each variable $R_i$ of interest, set the CPT of each $Q_i$ such that all parameters are trivial, i.e., 50%, and then observe the evidence $q_i$ for every $Q_i$. Doing this will not have any impact on the results of any queries. We then run the sensitivity analysis procedure on multiple parameters, and find the optimal solution of parameter changes, restricted to parameters in the CPTs of variables $Q_i$. This solution gives us the optimal combination of soft evidence on variables $R_i$, as it minimizes the distance measure, and hence disturbs the network least significantly.

For an example, we go back to the fire network, where we now face a scenario in which the alarm is triggered. The probability of having a fire is now 36.67%. We now wish to install a smoke detector, such that when it is also triggered, the probability of fire is at least 80%. To find the reliability required for this detector, we add a detector node as a child of the Smoke variable, while setting all its parameters as trivial. We then add the observation of the detector being triggered as part of the evidence, and perform our sensitivity analysis procedure.
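The dummy-node construction can be sketched on a two-node fragment R → Q. This is an illustration only: the function names, the prior of 0.3 on R, and the 10% error rates are hypothetical and are not values from the fire network.

```python
def likelihood_ratio(false_positive: float, false_negative: float) -> float:
    """Likelihood ratio encoded by the CPT of a dummy node Q.

    theta_{q|r} = 1 - false_negative and theta_{q|not r} = false_positive,
    so lambda = (1 - false_negative) / false_positive.
    """
    return (1.0 - false_negative) / false_positive


def posterior_given_q(prior_r: float, false_positive: float, false_negative: float) -> float:
    """Pr(r | q) for the fragment R -> Q, by direct application of Bayes' rule."""
    joint_r = prior_r * (1.0 - false_negative)      # Pr(r, q)
    joint_not_r = (1.0 - prior_r) * false_positive  # Pr(not r, q)
    return joint_r / (joint_r + joint_not_r)


prior = 0.3  # hypothetical prior on R, chosen only for illustration

# Trivial CPT: both parameters are 50%, so lambda = 1 and the query is unchanged.
trivial_lambda = likelihood_ratio(0.5, 0.5)
trivial_posterior = posterior_given_q(prior, 0.5, 0.5)

# A more reliable detector: both error rates at 10%, so lambda = 9.
strong_lambda = likelihood_ratio(0.1, 0.1)
strong_posterior = posterior_given_q(prior, 0.1, 0.1)

print(trivial_lambda, trivial_posterior)
print(strong_lambda, strong_posterior)
```

With both parameters at 50% the posterior equals the prior, which is why adding the dummy nodes has no effect on any query until the sensitivity analysis changes their parameters.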
The result suggests that if the false positive and the false negative rates of the detector are both 10.98%, the reliability of the hypothesis is achieved. This is equivalent to having soft evidence on the presence of smoke with a likelihood ratio of λ = 8.113.

6 Conclusion

This paper made contributions to the problem of sensitivity analysis in Bayesian networks with respect to multiple parameter changes. Specifically, we presented the technical and practical details involved in identifying multiple parameter changes that are needed to satisfy query constraints. The main highlight was the ability to identify optimal, multiple parameter changes that are restricted to a single CPT, where we showed the complexity of achieving this is similar to the one for single parameter changes, except for an additional cost involved with a simple numerical method. We also addressed the problem when multiple CPTs are involved, where we characterized the corresponding solution and its (higher) complexity. Finally, we discussed a number of applications of these results, including model debugging and information system design.

Acknowledgments

This work has been partially supported by NSF grant IIS-9988543 and MURI grant N00014-00-1-0617.

References

[1] SamIam: Sensitivity analysis, modeling, inference and more. URL: http://reasoning.cs.ucla.edu/samiam/.

[2] Enrique Castillo, José Manuel Gutiérrez, and Ali S. Hadi. Sensitivity analysis in discrete Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics, Part A (Systems and Humans), 27:412-423, 1997.

[3] Hei Chan and Adnan Darwiche. A distance measure for bounding probabilistic belief change. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI), pages 539-545, Menlo Park, California, 2002. AAAI Press.

[4] Hei Chan and Adnan Darwiche. When do numbers really matter? Journal of Artificial Intelligence Research, 17:265-287, 2002.

[5] Veerle M. H. Coupé, Finn V. Jensen, Uffe Kjærulff, and Linda C. van der Gaag. A computational architecture for n-way sensitivity analysis of Bayesian networks. Technical report, 2000.

[6] Adnan Darwiche. A differential approach to inference in Bayesian networks. Journal of the ACM, 50:280-305, 2003.

[7] Joseph Y. Halpern and Riccardo Pucella. A logic for reasoning about evidence. In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 297-304, San Francisco, California, 2003. Morgan Kaufmann Publishers.

[8] Finn Verner Jensen. Gradient descent training of Bayesian networks. In Proceedings of the Fifth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU), pages 190-200, Berlin, Germany, 1999. Springer-Verlag.

[9] Finn Verner Jensen. Bayesian Networks and Decision Graphs. Springer-Verlag, New York, 2001.

[10] Finn Verner Jensen, Steffen L. Lauritzen, and Kristian G. Olesen. Bayesian updating in causal probabilistic networks by local computations. Computational Statistics Quarterly, 5:269-282, 1990.

[11] Uffe Kjærulff and Linda C. van der Gaag. Making sensitivity analysis computationally efficient. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 317-325, San Francisco, California, 2000. Morgan Kaufmann Publishers.

[12] Kathryn Blackmond Laskey. Sensitivity analysis for probability assessments in Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics, 25:901-909, 1995.

[13] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, San Mateo, California, 1988.

[14] Prakash P. Shenoy and Glenn Shafer. Propagating belief functions with local computations. IEEE Expert, 1:43-52, 1986.

[15] Haiqin Wang and Marek J. Druzdzel. User interface tools for navigation in conditional probability tables and elicitation of probabilities in Bayesian networks. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 617-625, San Francisco, California, 2000. Morgan Kaufmann Publishers.