Support for Multiple Cause Diagnosis with Bayesian Networks Presentation Friday 4 October Randy Jagt Decision Systems Laboratory Intelligent Systems Program University of Pittsburgh
Overview Introduction Objectives Research Implementation GeNIe DIAG Tests & Results Conclusions
Diagnosis The process of determining the cause or malfunction by means of collecting information Introduction
Diagnostic Expert Systems Primary tasks: Determine the most probable cause Determine which information to gather Applications Medicine Troubleshooting Introduction
Bayesian Netwerk Visit To Asia? X-Ray Result Diseases Smoking? Dyspnea? Random variables Probabilistic relations Conditional probability table Marginal probability distribution Reasoning algorithms Introduction
Diagnostis Proces with BN Test Rankings Smoking? 0.26 X-Ray Result 0.26 Dyspnea 0.23 Visit To Asia? <0.01 Introduction
Suppose Multiple Causes Visit To Asia? Smoking? TC= Tuberculosis LC= Tuberculosis? (TC) Lung Cancer? (LC) Bronchitis? (BC) LungCancer BC= Bronchitis Tuberculosis or Lung Cancer? PN= Pneomia X-Ray Result Dyspnea? Introduction
Problem Statement Diagnosing multiple causes leads to the following problems: Presenting the combinations: User has to keep track on the change of exponential amount of combinations Calculational effort: To determine the test rankings, the joint probability distribution over all combinations has to be calculated Introduction
Objectives Investigation of Value of Information Development of approaches which: Handle the computational complexity Allow the user to work with multiple causes Implementation of these approaches Testing these approaches Objectives
Value of Information Value functions Functions that assign a value to a distribution Applied without a test V(P(H)) with a test V(P(H T)) VOI process Joint prob P(H) V(P(H)) For every teststate t V(P(H t)) EV(T) EB(T) TS(T)
Value Function Entropy (Shannon) ( P( H )) = P( h) log ( P( h) ) VENT 2 h H Minimum at uniform probability Monotonic decreasing function of the number of entries Research
Value of Information(2) Expected Value: Expected Benefit: Test Strength: t T ( P( H t) ) P( t) EV ( T ) = V EB( T ) = EV TS = ( ) ( P( H )) EB T V ( T ) V ( P( H )) VOI process Joint prob P(H) V(P(H)) For every teststate t V(P(H t)) EV(T) EB(T) TS(T)
Marginal Probability Approach Create new value functions that work with the marginal probability distribution Research
Marginal & Joint Relation Lower bound: P( a, b) max 0; P( a) + P( b) 1 { } Upper bound: P( a, b) min P( a), P( b) ( ) High joint probability All high marginal probabilities Low joint probability At least one low marginal probability Research
Marginal VOI Process Marginal Strength Function V P( M )) = ( P( h) 0.5) h M MS ( 2 1 2 n Minimum at probability 0.5 2 1 n VOI process P(M)= P(h 1 ),...,P(h n ) V MS (P(M)) For every teststate t V MS (P(M t)) EV(T) EB(T) TS(T)
Marginal Strength Research
Joint Probability Approach Use of a marginal based function or Copula function to calculate the joint probability distribution Research
Joint VOI Process Product Copula P Entropy ( h,..., ) = ( ) 1 hn P h i n i= 1 ( P( H )) = P( h) log ( P( h) ) VENT 2 h H VOI process P(h 1 ),...,P(h n ) Copula V(P(H)) For every teststate t Copula V(P(H t)) EV(T) EB(T) TS(T)
Presentational Aspect Denote interesting states as targets Tuberculosis? Absent Present LungCancer? Absent Present Bronchitis? Absent Present Let the user choose targets to pursue Target states TC_present LC_present BC_present Research
Diagnosis with Targets Marginal probability approach over all pursued targets Joint probability approach over all the combinations with at least one pursued target and the rest TC_absent LC_absent BC_absent TC_present LC_absent BC_absent TC_absent LC_absent BC_present TC_present LC_absent BC_present TC_absent TC_present TC_absent TC_present LC_present BC_absent LC_present BC_absent LC_present BC_present LC_present BC_present Research
Implementation SMILE (Structural Modeling, Inference, and Learning Engine) GeNIe (Graphical Network Interface) Visual C++ GeNIe DIAG Implementation
GeNIe DIAG GeNIe DIAG
Time Test Time test -> Time necessary for the VOI process Asia network Hepar II network -> Diagnosis of liver disorders -> 81 variables, 9 hypothesis variables Pitt network -> 181 variables, 47 hypothesis variables Tests & Results
Time Test Results Time in Seconds Marginal Probabiltiy Approach Joint Probability Approach True Joint Probability Asia network 3 targets 0 0 0 Hepar II network 10 targets 0 1 514 Pitt network 47 targets 0 >60 min >60 min Tests & Results
Quality Test & ROC-Analysis ROC-curve Sensitivity = likelihood that a present cause was correctly diagnosed Specifity = likelihood that an absence of cause was correctly diagnosed Hepar II network Tests & Results
Quality Test Results Tests & Results
Conclusions Development of two approaches Implemented in GeNIe & SMILE Ability to direct the diagnostic process Tests showed that Marginal probability approach=fast but less qaulitative Joint probability approach= slow but good quality Conclusions
Future Research Other copula functions Smart algorithm for calculating the true joint probability distribution Expansion of the multiple cause support to multiple test ranking Conclusions
Questions?