Bayesian Networks to design optimal experiments
Davide De March (davidedemarch@gmail.com)
Outline
evolutionary experimental design in high-dimensional spaces with costly experimentation
the microwell mixture experiments for vesicle formation
Bayesian Network models for designing experiments
The experimentation
How can we generate vesicles with a mixture of amphiphilic molecules?
The biochemical technology
microwell plates: one experiment in each well (a mixture experiment)
a large set of molecules
process variables (defined a priori by the experimentalists)
The mixture experiment
Each well contains a mixture of 5 molecules, selected from an initial set of 16: X = {X_1, ..., X_16}.
The constraints in this experimentation are: 0 <= X_i <= 1 and sum_{i=1}^{16} X_i = 1.
The system response is the turbidity, measured by a fluorescence microscope.
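A minimal Python sketch (not from the slides; the function name and the flat-Dirichlet sampling choice are assumptions for illustration) of generating one candidate well that satisfies these constraints: pick 5 of the 16 molecules and draw their proportions so they are non-negative and sum to 1.

```python
import random

def random_mixture(n_molecules=16, k=5, seed=None):
    """Draw one candidate mixture: k of the n molecules get a positive
    proportion, the rest are 0, and all proportions sum to 1."""
    rng = random.Random(seed)
    chosen = rng.sample(range(n_molecules), k)
    # A flat Dirichlet draw via normalised exponential variates.
    weights = [rng.expovariate(1.0) for _ in chosen]
    total = sum(weights)
    mixture = [0.0] * n_molecules
    for i, w in zip(chosen, weights):
        mixture[i] = w / total
    return mixture

m = random_mixture(seed=7)
assert abs(sum(m) - 1.0) < 1e-9 and all(0.0 <= x <= 1.0 for x in m)
```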
The optimal design
Find the best subset of variables that leads to a maximum system response, defining:
which factors are relevant for the response
their levels
their interactions
Classical approaches (simplex lattice design, simplex fractional design, screening) cannot deal with high-dimensional problems. In our case the simplex lattice design defines 15,504 points to be tested: too many and too costly.
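The 15,504 figure is consistent with counting the points of a {q, m} simplex lattice, i.e. all mixtures whose proportions are multiples of 1/m, which is the number of weak compositions C(q + m - 1, m). A quick check (the function name is ours, for illustration):

```python
from math import comb

def simplex_lattice_size(q, m):
    """Points in a {q, m} simplex lattice: mixtures of q components
    whose proportions are multiples of 1/m, i.e. C(q + m - 1, m)."""
    return comb(q + m - 1, m)

print(simplex_lattice_size(16, 5))  # 15504 candidate mixtures
```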
The design as a dynamical process
The iterative process of designing experiments allows us to reach an improved or optimal system response: DESIGN -> EXPERIMENT -> MODELING -> DESIGN.
The use of statistical models helps experimentalists to select the most informative variables and the significant interactions, and to define a new experimental design.
Experimental design by model
We model the experimental data.
We identify the most significant variables and the most relevant interactions for the system response.
We use the Bayesian network to define an optimal experimental design.
Bayesian Network (BN)
BNs are probabilistic graphical models that specify a multivariate probability distribution over a set of variables. A BN is described by a triple {V, A, Θ}, where:
{V, A} is the topological part of the BN and represents the structure of the network: V is the set of nodes (each node is a variable) and A is the set of arcs between nodes, describing the dependencies between them (no arc between two nodes means independence).
Θ is the quantitative part of the network and represents the conditional probability distributions (the BN's parameters).
Connections in the BN
All the links of a BN can be summarised in three types of connections.
Serial connection: A -> B -> C. In this connection, A and C are conditionally independent given knowledge about the state of B.
Diverging connection: A -> B, A -> C, ..., A -> E. Without knowledge about the state of A, B, C and E are dependent. B, C and E become conditionally independent given knowledge of the state of A (A blocks the flow of information once it is observed).
Converging connection: B -> A, C -> A, ..., E -> A. B, C and E are independent without knowledge about the state of A (or any of its descendants). Given evidence about the state of A, B, C and E become conditionally dependent.
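The serial-connection claim can be verified numerically on a toy chain A -> B -> C. All the conditional probability tables below are made up for illustration; the point is that once B is observed, conditioning additionally on A does not change the distribution of C.

```python
from itertools import product

# Illustrative binary CPTs for the chain A -> B -> C (numbers are arbitrary).
P_A = {0: 0.3, 1: 0.7}
P_B_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # P_B_given_A[a][b]
P_C_given_B = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # P_C_given_B[b][c]

def joint(a, b, c):
    """P(a, b, c) = P(a) P(b|a) P(c|b)."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

def cond_C(c, b, a=None):
    """P(C=c | B=b), or P(C=c | B=b, A=a) when a is given."""
    a_states = (a,) if a is not None else (0, 1)
    num = sum(joint(x, b, c) for x in a_states)
    den = sum(joint(x, b, y) for y in (0, 1) for x in a_states)
    return num / den

# Once B is observed, A carries no further information about C.
for a, b, c in product((0, 1), repeat=3):
    assert abs(cond_C(c, b, a) - cond_C(c, b)) < 1e-12
```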
Conditional independence & Markov property
Given three variables X, Y and Z, X and Y are conditionally independent given Z, written X ⊥ Y | Z, if:
P(X = x | Y = y, Z = z) = P(X = x | Z = z) for all x, y, z.
The Markov property: X_i ⊥ Nd(X_i) | Pa(X_i) for every X_i in X, i.e. each variable is conditionally independent of its non-descendants (Nd) given its parents (Pa). It simplifies the computation of the joint probability distribution over the nodes.
Example
In this network: F ⊥ (S, X) | (B, L); L ⊥ B | S; X ⊥ S | L.
The joint distribution
The joint distribution of the example can be represented, in topological order, as:
P(s, b, l, f, x) = P(s) P(b|s) P(l|b, s) P(f|b, s, l) P(x|b, s, l, f).
Due to the Markov property we have, for x: P(x|b, s, l, f) = P(x|l).
Consequently the joint distribution can be expressed as:
P(s, b, l, f, x) = P(s) P(b|s) P(l|s) P(f|b, l) P(x|l).
In general, the joint distribution of the network is:
P(x_1, ..., x_n) = prod_{i=1}^{n} P(x_i | Pa(x_i)).
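The simplified factorisation can be checked in code. The CPT numbers below are invented for the sketch (the slides give none); the check is that the product P(s) P(b|s) P(l|s) P(f|b,l) P(x|l) is a valid joint distribution, i.e. it sums to 1 over all configurations.

```python
from itertools import product

# Illustrative binary CPTs for the network S -> B, S -> L, B -> F <- L, L -> X.
P_s = {0: 0.5, 1: 0.5}
P_b_s = {s: {0: 0.7 - 0.3 * s, 1: 0.3 + 0.3 * s} for s in (0, 1)}
P_l_s = {s: {0: 0.9 - 0.5 * s, 1: 0.1 + 0.5 * s} for s in (0, 1)}
P_f_bl = {(b, l): {0: 0.8 - 0.3 * b - 0.3 * l, 1: 0.2 + 0.3 * b + 0.3 * l}
          for b in (0, 1) for l in (0, 1)}
P_x_l = {l: {0: 0.6 - 0.4 * l, 1: 0.4 + 0.4 * l} for l in (0, 1)}

def joint(s, b, l, f, x):
    """P(s,b,l,f,x) = P(s) P(b|s) P(l|s) P(f|b,l) P(x|l)."""
    return P_s[s] * P_b_s[s][b] * P_l_s[s][l] * P_f_bl[(b, l)][f] * P_x_l[l][x]

# The factorised joint must sum to 1 over all 2^5 configurations.
total = sum(joint(*cfg) for cfg in product((0, 1), repeat=5))
assert abs(total - 1.0) < 1e-12
```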
Inference in BNs
Given some evidence about the state of a variable, you can infer the state of the other variables in the network.
The case of two binary variables (Bayes' rule):
P(B | A = a_j) = P(A = a_j | B) P(B) / sum_{i=1}^{2} P(A = a_j | B = b_i) P(B = b_i).
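This two-binary-variable case is small enough to compute directly. A sketch with made-up prior and likelihood numbers (the function name and values are ours, for illustration):

```python
def posterior(prior_b, like_a_given_b, a_j):
    """P(B = b_i | A = a_j) for binary B via Bayes' rule.
    prior_b[i] = P(B = b_i); like_a_given_b[i][j] = P(A = a_j | B = b_i)."""
    evidence = sum(like_a_given_b[i][a_j] * prior_b[i] for i in (0, 1))
    return [like_a_given_b[i][a_j] * prior_b[i] / evidence for i in (0, 1)]

# Illustrative numbers (not from the slides):
prior = [0.6, 0.4]                      # P(B)
likelihood = [[0.9, 0.1], [0.3, 0.7]]   # P(A | B)
post = posterior(prior, likelihood, a_j=1)  # observe A = a_2
assert abs(sum(post) - 1.0) < 1e-12
```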
Inference in a BN is an NP-hard problem (Cooper, 1990). There are many algorithms for exact or approximate inference. Junction Tree propagation is an efficient algorithm with local procedures for BN inference (Jensen, 2001).
Junction Tree
A Junction Tree is an undirected acyclic graph whose nodes are clusters of variables. Given two clusters, C_1 and C_2, every node on the path between them contains their intersection C_1 ∩ C_2. A separator, S, is associated with each edge and contains the variables in the intersection of the neighbouring nodes.
[Figure: construction of a junction tree. a) Directed Acyclic Graph (DAG); b) moral graph, with the "marriage" of parents; c) triangulated graph, with no chordless cycle of length 4 or more; d) junction tree.]
Learning the structure
Learning the structure means building the best network for the process. There are three approaches to model structure and parameters:
expert knowledge
modelling data (Search & Score)
a mix of the two previous methods
Search & Score algorithms
Search & Score techniques are regarded as efficient approaches to learning the network structure (some examples: genetic algorithms, K2, hill climbing). This approach maximises a score function over the space of possible DAGs (i.e. it maximises the fit of the model to the data).
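A Search & Score algorithm needs a decomposable score to compare candidate DAGs. A minimal sketch (our own simplification, not the slides' implementation) of the per-node BIC term: maximum-likelihood log-likelihood of the node's CPT minus a penalty for its free parameters. Hill climbing would call this for each node under each candidate parent set and keep the structure with the highest total.

```python
from math import log

def bic_node(data, child, parents):
    """BIC contribution of one node: ML log-likelihood of its CPT
    minus 0.5 * (#free parameters) * log N.
    data: list of dicts mapping variable name -> discrete value."""
    n = len(data)
    counts, parent_counts = {}, {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts[(key, row[child])] = counts.get((key, row[child]), 0) + 1
        parent_counts[key] = parent_counts.get(key, 0) + 1
    loglik = sum(c * log(c / parent_counts[key])
                 for (key, _), c in counts.items())
    states = len({row[child] for row in data})
    n_params = (states - 1) * max(1, len(parent_counts))
    return loglik - 0.5 * n_params * log(n)

# Toy data where X copies S exactly: adding S as a parent raises the score.
data = [{'S': i % 2, 'X': i % 2} for i in range(8)]
assert bic_node(data, 'X', ['S']) > bic_node(data, 'X', [])
```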
The BN (S&S algorithm) to design the biochemical experiments
The final structure: learned with the GeNIe software, using search & score (hill climbing) and the BIC score.
Inference in the BN
The response, MeanT, is discretised into 6 levels. The aim of the analysis is to understand the system variations and the relations between variables in the biochemical process.
Inference on the BN
Software for inference: Netica (www.norsys.com)
Evidence on MeanT
With evidence on MeanT at its high-level values, X_13 drastically increases its presence in state b, while other variables decrease their presence (e.g. X_15, X_10).
Results
From the dependence/independence structure of the Bayesian network we highlight some important results:
X_13 is the most important molecule for obtaining high values of MeanT, but only at a certain level (0.2)
some interactions, especially those involving X_13, help to obtain higher values of the response
The identification of the model and of the variable dependencies allows us to provide the experimentalists with the next set of experiments.
Thanks