Charles University in Prague
Faculty of Mathematics and Physics
Department of Probability and Mathematical Statistics

Abstract of Doctoral Thesis

INDEPENDENCE MODELS

Mgr. Petr Šimeček

Supervisor: RNDr. Milan Studený, DrSc., ÚTIA AV ČR
Study program: mathematics
Study branch: m4, probability and mathematical statistics

Prague 2007


This doctoral thesis was written within the doctoral study programme which the candidate pursued at the Faculty of Mathematics and Physics of Charles University.

Candidate: Mgr. Petr Šimeček

Supervisor: RNDr. Milan Studený, DrSc.
Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
Pod Vodárenskou věží 4, Praha 8

Opponents: Doc. RNDr. Petr Lachout, CSc.
Department of Probability and Mathematical Statistics
Faculty of Mathematics and Physics, Charles University
Sokolovská 83, Praha 8

Ing. František Matúš, CSc.
Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
Pod Vodárenskou věží 4, Praha 8

The defense of the thesis takes place at 12:00 before the committee for defenses of doctoral theses in the branch m4 - Probability and Mathematical Statistics at the Faculty of Mathematics and Physics of Charles University, Sokolovská 83, Praha 8, in the lecture room KPMS Praktikum. The thesis is available at the Department for Doctoral Studies of the Faculty of Mathematics and Physics of Charles University, Ke Karlovu 3, Praha 2.

Chairman of the Branch Board: Prof. RNDr. Marie Hušková, DrSc.
Department of Probability and Mathematical Statistics
Faculty of Mathematics and Physics, Charles University
Sokolovská 83, Praha 8

Contents

Introduction
1 Basic Concepts
  1.1 Conditional Independence
  1.2 Gaussian Distributional Framework
  1.3 Discrete Distributional Framework
  1.4 Independence Models
  1.5 Graphical Independence Models
2 No Finite Characterization Exists
3 Representable Models Over N = {1, 2, 3, 4}
  3.1 Gaussian Distributional Framework
  3.2 Discrete Distributional Framework
    3.2.1 Discrete representability
    3.2.2 Positive binary representability
    3.2.3 Positive discrete representability
4 Probability Distributions Over an Independence Model
  4.1 Connection Between Graphical Independence Models and Exponential Families in the Regular Distributional Framework
  4.2 Model Search
  4.3 Simulation Study
References
List of Publications

Introduction

An essential part of probabilistic reasoning is the theory of graphical models, endowing a multivariate probability distribution with an independence structure. An interesting question, originally coming from J. Pearl, is the problem of probabilistic representability: for which lists of conditional independence constraints does there exist a random vector satisfying these and only these independences? In this work the known results are collected and the problem is further studied in several distributional frameworks. The focus is particularly on the special case of four random variables, where the solution is found for a general Gaussian distribution (extending the result of R. Lněnička) and partially for discrete and binary distributions. At the end, the achieved results are used to derive a variance matrix estimator in the case of a regular Gaussian distribution.

This abstract briefly covers the contents of the thesis, omitting proofs and focusing on the most important topics. Lemmas and theorems are not necessarily in the same order as in the thesis (see the references there).

1 Basic Concepts

For the reader's convenience, auxiliary results related to probability theory and independence models are recalled in this section.

The object of our interest will be a random vector ξ = (ξ_a)_{a∈N} indexed by a finite set N = {1, 2, ..., n}. Its distribution and its density with respect to some σ-finite product measure will be denoted by P and f. For A ⊆ N, the subvector (ξ_a)_{a∈A} is denoted by ξ_A; ξ_∅ is presumed to be a constant. Analogously, P_A and f_A are a marginal distribution and density; x_A is the corresponding coordinate projection of a constant vector x = (x_a)_{a∈N}. A singleton {a} will be shortened to a, and the union of sets A ∪ B will be written simply as the juxtaposition AB. For a square matrix Σ, let Σ^{-1}, Σ^{-}, |Σ| and rank(Σ) denote its inverse, a generalized inverse, its determinant and its rank, i.e. the dimension of the space spanned by its rows (or columns), respectively.

1.1 Conditional Independence

Provided A, B, C are pairwise disjoint subsets of N, ξ_A ⊥⊥ ξ_B | ξ_C stands for the statement "ξ_A and ξ_B are conditionally independent given ξ_C". In particular, unconditional independence (C = ∅) is abbreviated as ξ_A ⊥⊥ ξ_B. The meaning of the conditional independence (CI) statement ξ_A ⊥⊥ ξ_B | ξ_C is as follows: if the value of ξ_C is known, then knowledge about ξ_A is irrelevant (useless for prediction, ...) for knowledge of ξ_B, and vice versa. There are many equivalent ways to formalize this definition. Let us list a few of them in Lemma 1 (see [2], pp. 29, for a detailed discussion).

Lemma 1. If A, B and C are pairwise disjoint subsets of N, then statements i)–vi) are equivalent:

i) f_{AB|C}(x_{AB} | x_C) = f_{A|C}(x_A | x_C) · f_{B|C}(x_B | x_C)
ii) f_{ABC}(x_{ABC}) · f_C(x_C) = f_{AC}(x_{AC}) · f_{BC}(x_{BC})
iii) f_{A|BC}(x_A | x_{BC}) = f_{A|C}(x_A | x_C)
iv) f_{AC|B}(x_{AC} | x_B) = f_{A|C}(x_A | x_C) · f_{C|B}(x_C | x_B)
v) f_{ABC}(x_{ABC}) = f_{A|C}(x_A | x_C) · f_{BC}(x_{BC})
vi) there exist functions g, h such that f_{ABC}(x_{ABC}) = g(x_{AC}) · h(x_{BC}),

where f_{A|B} denotes the conditional density f_{A|B}(x_A | x_B) = f_{AB}(x_{AB}) / f_B(x_B), which is defined under the condition f_B(x_B) > 0. Each of the equations above should be understood in the sense that it holds for all x such that all conditional densities in the equation are defined.

Moreover, if the multiinformation¹ (MI) of ξ is finite, then another characterization of CI exists (see [15] for a proof):

Lemma 2. If ξ is a random vector with finite MI and A, B, C are pairwise disjoint subsets of N, then

MI(ξ_{ABC}) + MI(ξ_C) − MI(ξ_{AC}) − MI(ξ_{BC}) ≥ 0

and

MI(ξ_{ABC}) + MI(ξ_C) − MI(ξ_{AC}) − MI(ξ_{BC}) = 0 ⟺ ξ_A ⊥⊥ ξ_B | ξ_C.

If ξ does not have a density with respect to the given σ-finite measure, then generalized conditional independence ξ_A ⊥⊥ ξ_B | ξ_C may be defined as

L(ξ_{AB} | ξ_C = x_C) = L(ξ_A | ξ_C = x_C) ⊗ L(ξ_B | ξ_C = x_C)   P_C-a.e.,

i.e. the conditional distribution of ξ_{AB} given ξ_C is the product of the corresponding conditional distributions of ξ_A and ξ_B given ξ_C. This generalization will be used only in the case of a Gaussian distribution whose variance matrix is not regular.

As the following Lemma 3 shows, any CI statement can be written in the form of elementary CI statements, i.e. as a collection (ξ_{A_i} ⊥⊥ ξ_{B_i} | ξ_{C_i})_i such that |A_i| = |B_i| = 1 for all i.

Lemma 3. Let A, B, C be pairwise disjoint subsets of N:

(ξ_A ⊥⊥ ξ_B | ξ_C) ⟺ (∀ a ∈ A, b ∈ B and D such that C ⊆ D ⊆ ABC \ {a, b}: ξ_a ⊥⊥ ξ_b | ξ_D).

Proof. See [4], Lemma 3, for a proof.

¹ That is, the Kullback–Leibler divergence (relative entropy) between the joint distribution and the product of its one-dimensional marginals.
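The quantity on the left-hand side of Lemma 2 is the conditional mutual information between ξ_A and ξ_B given ξ_C. The following R sketch (illustrative code, not taken from the thesis; the distribution is an arbitrary choice) checks the lemma numerically for three binary variables with ξ_1 ⊥⊥ ξ_2 | ξ_3 built in:

multiinformation <- function(p) {
  # MI of a discrete distribution given as an array of probabilities:
  # KL divergence from the product of its one-dimensional margins
  prod.marg <- array(1, dim = dim(p))
  for (d in seq_along(dim(p))) {
    m <- apply(p, d, sum)                      # margin of the d-th variable
    prod.marg <- sweep(prod.marg, d, m, `*`)
  }
  sum(p * log(p / prod.marg), na.rm = TRUE)    # cells with p = 0 contribute 0
}

p <- array(0, dim = c(2, 2, 2))                # joint pmf p(i,j,k) = p(k) p(i|k) p(j|k)
p3 <- c(.4, .6)
p1.3 <- cbind(c(.3, .7), c(.8, .2))            # columns: conditionals given k
p2.3 <- cbind(c(.5, .5), c(.1, .9))
for (i in 1:2) for (j in 1:2) for (k in 1:2)
  p[i, j, k] <- p3[k] * p1.3[i, k] * p2.3[j, k]

# MI(123) + MI(3) - MI(13) - MI(23); MI of a single variable is zero
multiinformation(p) -
  multiinformation(apply(p, c(1, 3), sum)) -
  multiinformation(apply(p, c(2, 3), sum))     # evaluates to ~0, as Lemma 2 predicts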

1.2 Gaussian Distributional Framework

Definition. A Gaussian distribution of a random vector ξ = (ξ_1, ..., ξ_n) is a probability distribution specified by its characteristic function

φ_ξ(t) = E exp(i t′ξ) = exp( i t′µ − ½ t′Σt ),

where µ ∈ R^n and a symmetric positive semi-definite matrix Σ ∈ R^{n×n} are the mean and variance parameters, respectively. If Σ is positive definite (Σ > 0), i.e. regular, the distribution is called a regular Gaussian distribution; it has the density with respect to Lebesgue measure

f(x) = (2π)^{−n/2} |Σ|^{−1/2} exp( −½ (x − µ)′ Σ^{−1} (x − µ) )

and finite multiinformation

MI(ξ) = ½ ( Σ_{a=1}^n log σ_{aa} − log |Σ| ).

On the other hand, if Σ is not regular, then ξ takes values in a proper affine subspace of R^n, i.e. there exists a ∈ R^n such that

a′(ξ − µ) = 0  (a.s.),   (1)

see [1], pp. 71; neither a density with respect to Lebesgue measure exists, nor is the multiinformation generally finite.

Given a matrix Σ = (σ_{ab})_{a,b∈{1,...,n}} and subsets A, B of {1, 2, ..., n}, the submatrix with rows A and columns B will be denoted by Σ_{A,B} = (σ_{ab})_{a∈A, b∈B}. The following lemma shows that both the marginal and the conditional distributions of a Gaussian distribution are again Gaussian.

Lemma 4. Let A and B be disjoint subsets of N = {1, 2, ..., n}.

i) The marginal distribution of ξ_A is a Gaussian distribution with variance matrix Σ_{A,A}.
ii) The conditional distribution of ξ_A given ξ_B = x_B is a Gaussian distribution with variance matrix Σ_{A,A|B} = Σ_{A,A} − Σ_{A,B} Σ_{B,B}^{−} Σ_{B,A}, where Σ_{B,B}^{−} is any generalized inverse of Σ_{B,B}.
iii) Moreover, if Σ > 0 then also Σ_{A,A} > 0 and Σ_{A,A|B} > 0.
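In passing, the multiinformation formula above is a one-liner in R; a minimal sketch (illustrative, not from the thesis):

gauss.mi <- function(Sigma) {
  # MI(xi) = (1/2) (sum_a log sigma_aa - log |Sigma|) for a regular Gaussian xi
  0.5 * (sum(log(diag(Sigma))) - as.numeric(determinant(Sigma)$modulus))
}
Sigma <- matrix(c( 1, .5, .2,
                  .5,  1,  0,
                  .2,  0,  1), nrow = 3)
gauss.mi(Sigma)   # positive; equals 0 exactly when Sigma is diagonal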

Proof. See [2], pp. 256, for a proof of i) and ii). The regularity of Σ_{A,A} is obvious; the Cholesky decomposition of Σ > 0 yields Σ_{A,A|B} > 0.

Let us emphasize that the variance matrix Σ_{A,A|B} of the conditional distribution does not depend on the choice of x_B.

Lemma 5. Let a and b be distinct elements of {1, ..., n} and C ⊆ {1, ..., n} \ ab:

i) ξ_a ⊥⊥ ξ_b ⟺ σ_{ab} = 0.
ii) If Σ_{C,C} > 0, then ξ_a ⊥⊥ ξ_b | ξ_C ⟺ |Σ_{aC,bC}| = 0.
iii) If D ⊆ C is such that Σ_{D,D} > 0 and rank(Σ_{D,D}) = rank(Σ_{C,C}), then ξ_a ⊥⊥ ξ_b | ξ_C ⟺ ξ_a ⊥⊥ ξ_b | ξ_D ⟺ |Σ_{aD,bD}| = 0.
iv) If Σ > 0, then ξ_a ⊥⊥ ξ_b | ξ_C ⟺ |(Σ^{−1})_{N\(aC), N\(bC)}| = 0.

Proof. The first part is a well-known fact (cf. [2], pp. 257). The second comes from the Cholesky decomposition yielding

|Σ_{aC,bC}| = 0 ⟺ σ_{ab} − Σ_{a,C} Σ_{C,C}^{−1} Σ_{C,b} = 0 ⟺ (Σ_{ab,ab|C})_{a,b} = 0 ⟺ ξ_a ⊥⊥ ξ_b | ξ_C.

The third part follows from Lemma 4 ii) by constructing the generalized inverse (cf. [9], section 1b.5) as the block matrix

Σ_{C,C}^{−} = ( Σ_{D,D}^{−1}  0 ; 0  0 ),

i.e. Σ_{D,D}^{−1} in the upper-left corner and zeros elsewhere. See [3] for a proof of iv).

1.3 Discrete Distributional Framework

Definition. A random vector ξ = (ξ_1, ..., ξ_n) is called discrete if each ξ_a takes values in a state space X_a such that 1 < |X_a| < ∞. If |X_a| = 2 for all a, then ξ is called binary. In the binary case we will assume, without loss of generality, that X_a = {−1, 1} for all a.

Definition. A discrete random vector ξ is called positive if

0 < P(ξ = x) < 1 for all x ∈ X = ∏_{a=1}^n X_a.

In the case of a discretely distributed random vector, MI is always finite, and the variables ξ_a and ξ_b are conditionally independent given ξ_C if and only if

P(ξ_{abC} = x_{abC}) · P(ξ_C = x_C) = P(ξ_{aC} = x_{aC}) · P(ξ_{bC} = x_{bC}) for all x_{abC} ∈ X_{abC}.   (2)

As shown in the thesis, the class of binary distributions can be parametrized by its moments, A ⊆ N:

e_A = E ∏_{a∈A} ξ_a = Σ_{x_N ∈ {−1,1}^N} P(ξ_N = x_N) ∏_{a∈A} x_a;   (3)

e_∅ is presumed to be one. The parametrization (3) implies

P(ξ_A = x_A) = (½)^{|A|} Σ_{B⊆A} ( ∏_{b∈B} x_b ) e_B,

and the CI condition (2) may be written in the following form: ξ_a ⊥⊥ ξ_b | ξ_C if and only if for all x_C ∈ {−1, 1}^C

( Σ_{D⊆C} e_{abD} ∏_{d∈D} x_d ) ( Σ_{D⊆C} e_D ∏_{d∈D} x_d ) = ( Σ_{D⊆C} e_{aD} ∏_{d∈D} x_d ) ( Σ_{D⊆C} e_{bD} ∏_{d∈D} x_d ).

The CI statements ξ_a ⊥⊥ ξ_b | ξ_C for |C| = 0, 1, 2 are spelled out below:

ξ_a ⊥⊥ ξ_b:
e_{ab} = e_a e_b

ξ_a ⊥⊥ ξ_b | ξ_c:
e_{abc} + e_{ab} e_c = e_a e_{bc} + e_b e_{ac}
e_{ab} + e_{abc} e_c = e_{ac} e_{bc} + e_a e_b

ξ_a ⊥⊥ ξ_b | ξ_{cd}:
e_{abcd} + e_{abc} e_d + e_{abd} e_c + e_{ab} e_{cd} = e_a e_{bcd} + e_b e_{acd} + e_{ac} e_{bd} + e_{ad} e_{bc}
e_{abc} + e_{abd} e_{cd} + e_{ab} e_c + e_d e_{abcd} = e_{ad} e_{bcd} + e_{bd} e_{acd} + e_a e_{bc} + e_b e_{ac}
e_{abd} + e_{abc} e_{cd} + e_{ab} e_d + e_c e_{abcd} = e_{ac} e_{bcd} + e_{bc} e_{acd} + e_a e_{bd} + e_b e_{ad}
e_{ab} + e_c e_{abc} + e_d e_{abd} + e_{cd} e_{abcd} = e_a e_b + e_{ac} e_{bc} + e_{ad} e_{bd} + e_{acd} e_{bcd}

1.4 Independence Models

Let N = {1, 2, ..., n} be a finite set and let T_N denote the set of all triplets ⟨ab|C⟩ such that ab is an (unordered) pair of distinct elements of N and C ⊆ N \ ab.

Definition. Subsets of T_N will be referred to here as formal independence models over N.
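The inversion formula above is straightforward to implement. The following R sketch (an illustration, not the thesis code) recovers the joint pmf of ξ_N from the 2^n moments, indexing subsets of N by bit masks:

pmf.from.moments <- function(e, n) {
  # e[mask + 1] is the moment e_A of the subset A encoded by the bit mask;
  # mask 0 is the empty set, so e[1] = 1
  grid <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))   # points of {-1,1}^n
  apply(grid, 1, function(x) {
    total <- 0
    for (mask in 0:(2^n - 1)) {
      members <- which(bitwAnd(mask, 2^(0:(n - 1))) > 0)
      total <- total + e[mask + 1] * prod(x[members])      # empty product is 1
    }
    total / 2^n
  })
}

# xi_1 _||_ xi_2 forces e_12 = e_1 e_2; e.g. for moments e_1 = .2, e_2 = -.4:
e <- c(1, .2, -.4, .2 * (-.4))     # e_{}, e_1, e_2, e_12
pmf.from.moments(e, 2)             # four nonnegative probabilities summing to 1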

The independence model I(ξ) induced by a random vector ξ = (ξ_1, ..., ξ_n) is the independence model over {1, 2, ..., n} defined as follows:

I(ξ) = { ⟨xy|Z⟩ ; ξ_x ⊥⊥ ξ_y | ξ_Z }.

Let us emphasize that the independence model I(ξ) uniquely determines also all other conditional independencies among subvectors of ξ (due to Lemma 3).

Definition. An independence model I is said to be Gaussian (g-), regular Gaussian (g+-), discrete (d-), positive discrete (d+-), binary (b-) and positive binary (b+-) representable if there exists a random vector ξ that is Gaussian, regular Gaussian, discrete, positive discrete, binary and positive binary, respectively, and I = I(ξ).

Lemma 6. Let a, b, c be distinct elements of N = {1, ..., n} and D ⊆ N \ abc. If an independence model I over N is either discrete or Gaussian representable, then

( { ⟨ab|cD⟩, ⟨ac|D⟩ } ⊆ I ) ⟺ ( { ⟨ac|bD⟩, ⟨ab|D⟩ } ⊆ I ).   (4)

Moreover, if I is positive discrete or regular Gaussian representable, then

( { ⟨ab|cD⟩, ⟨ac|bD⟩ } ⊆ I ) ⟹ ( { ⟨ab|D⟩, ⟨ac|D⟩ } ⊆ I ).   (5)

Proof. These are the so-called semigraphoid and graphoid properties; cf. [2] or [15] for more details.

Diagrams proposed by R. Lněnička will be used for the visualisation of an independence model I over N with |N| ≤ 4. Each element of N is plotted as a dot. If ⟨ab|∅⟩ ∈ I, then the dots corresponding to a and b are joined by a line. If ⟨ab|c⟩ ∈ I, then we put a line between the dots corresponding to a and b and add a small line in the middle pointing in the direction of c. If both ⟨ab|c⟩ and ⟨ab|d⟩ are elements of I, then only one line with two small lines in the middle is plotted. Finally, ⟨ab|cd⟩ ∈ I is visualised by a brace between a and b. For instance, see the diagram of I = { ⟨12|∅⟩, ⟨23|1⟩, ⟨23|4⟩, ⟨34|12⟩, ⟨14|∅⟩, ⟨14|2⟩ } in Figure 1.

Fig. 1: Diagram of an independence model.

Two independence models I and J over N and M are isomorphic if there exists a bijective mapping π from N to M such that

⟨xy|Z⟩ ∈ I ⟺ ⟨π(x)π(y)|π(Z)⟩ ∈ J,

where π(Z) stands for {π(z); z ∈ Z}. See Figure 2 for an example of three isomorphic models. An equivalence class of independence models with respect to the isomorphism relation will be referred to as a type. It is easy to see that the models of the same type are either all representable or none of them is representable. Consequently, we can classify an entire type as representable or non-representable.

Fig. 2: Example of three isomorphic models.

Definition. If I is an independence model over N = {1, ..., n} and E, F are disjoint subsets of N, then the minor I[^F_E is the independence model over N \ EF defined by

I[^F_E = { ⟨ab|C⟩ ; E ∩ (abC) = ∅, ⟨ab|CF⟩ ∈ I }.

Minorization of independence models is an analogue of the operations of marginalization and conditioning of distributions. Note that I[^∅_∅ = I and (I[^{F_1}_{E_1})[^{F_2}_{E_2} = I[^{F_1 F_2}_{E_1 E_2}.

In the discrete distributional framework, the intersection of representable models is also representable (see Lemma 21, pp. 46, in the thesis or [17], pp. 5, for a proof):

Lemma 7. If I and I′ are d-representable (d+-representable) models, then I ∩ I′ is d-representable (d+-representable).

From Lemma 7 it follows that if I is a d-representable (d+-representable) independence model, then all its minors I[^F_E are also d-representable (d+-representable); i.e. the classes of d-/d+-representable models are closed under minorization (cf. Lemma 22, pp. 46, in the thesis). The classes of g-/g+-representable models are closed under taking minors² as well, due to Lemma 4 (cf. Lemma 11, pp. 26, in the thesis).

² However, this property does not hold for the classes of b-/b+-representable models: I = { ⟨ab|cd⟩, ⟨ab|d⟩ } is b+-representable, but I[^d_∅ = { ⟨ab|c⟩, ⟨ab|∅⟩ } is not b-representable.
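A formal independence model can be handled directly as a list of triplets. The following R sketch (a hypothetical helper, not from any package) implements the minor operation from the definition above and reproduces the example of footnote 2:

# a triplet <ab|C> is stored as list(ab = c(a, b), C = c(...))
minor <- function(I, E = integer(0), F = integer(0)) {
  out <- list()
  for (t in I) {
    ok <- all(F %in% t$C) &&                     # <ab|CF> must lie in I
      length(intersect(c(E, F), t$ab)) == 0 &&   # a and b survive in N \ EF
      length(intersect(E, t$C)) == 0             # E is disjoint from abC
    if (ok) out[[length(out) + 1]] <- list(ab = t$ab, C = setdiff(t$C, F))
  }
  out
}

I <- list(list(ab = c(1, 2), C = c(3, 4)),       # <12|34>
          list(ab = c(1, 2), C = 4))             # <12|4>
minor(I, F = 4)                                  # the minor { <12|3>, <12|> } over {1,2,3}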

1.5 Graphical Independence Models

In applications, the independence model determining a class of probability distributions is usually given not as a list of prescribed conditional independencies but as an undirected or directed acyclic graph. That is not only for reasons of interpretation³ but also because of the nice theoretical properties of such models (like exact formulas for ML estimates and computationally efficient ways to evaluate conditional distributions in the case of decomposable models). The theory is described in detail in [2] and [8]; for a short introduction see Chapter 4 in the thesis. Here, only undirected graphs and the independence models derived by the global Markov property will be presented.

Definition. An undirected graph G = (N, E) is a pair of a finite vertex set N and a set of undirected edges between these vertices,

E ⊆ { ab ; a, b ∈ N, a ≠ b }.

Usually, vertices are visualized as dots and edges as lines between them (see Figure 3). A path between two distinct vertices a, b ∈ N is a sequence of k ≥ 2 vertices a = v_1, ..., v_k = b such that v_i v_{i+1} ∈ E for i = 1, 2, ..., k−1. Let us assume a, b ∈ N are two distinct vertices and C ⊆ N \ ab. The vertices a and b are said to be separated by C (in symbols a ⊥ b | C) if every path between a and b contains a vertex from C.

Fig. 3: An undirected graph G.

Definition. The independence model derived from a graph⁴ G = (N, E) is the independence model over N defined as follows:

I(G) = { ⟨ab|C⟩ ; a ⊥ b | C in the graph G }.

³ An edge stands for a significant correlation, an arrow means direct causality.
⁴ There exist more ways to derive a model from a graph, see pp. 61 in the thesis. For simplicity we restrict ourselves here to the so-called global separation criterion.
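Separation is a purely graph-theoretic test. A small R sketch (illustrative, not from the thesis) decides a ⊥ b | C by checking reachability in the graph with the vertices of C deleted, and can be run on the example below:

separated <- function(edges, a, b, C = integer(0)) {
  # a _|_ b | C  iff  b is unreachable from a after removing the vertices of C
  keep <- setdiff(unique(c(edges)), C)
  reach <- a
  repeat {
    nb <- c(edges[edges[, 1] %in% reach & edges[, 2] %in% keep, 2],
            edges[edges[, 2] %in% reach & edges[, 1] %in% keep, 1])
    new <- setdiff(nb, reach)
    if (length(new) == 0) return(!(b %in% reach))
    reach <- c(reach, new)
  }
}

E <- rbind(c(1, 2), c(1, 3), c(2, 3), c(2, 4), c(3, 4), c(3, 5))  # graph of Figure 3
separated(E, 2, 5, C = 3)   # TRUE:  vertex 3 separates 2 and 5
separated(E, 2, 5, C = 1)   # FALSE: the path 2-4-3-5 avoids vertex 1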

Example. Consider the graph G = (N, E) in Figure 3. The vertex set is N = {1, 2, 3, 4, 5}; the graph contains the edges E = {12, 13, 23, 24, 34, 35}. Every two vertices are connected by a path. Vertices 2 and 5 are separated by the vertex 3, or by the set of vertices 14, but not just by vertex 1 or 4. The independence model derived from the graph G is

I(G) = { ⟨14|23⟩, ⟨14|235⟩, ⟨15|3⟩, ⟨15|34⟩, ⟨15|23⟩, ⟨15|234⟩, ⟨25|3⟩, ⟨25|34⟩, ⟨25|13⟩, ⟨25|134⟩, ⟨45|3⟩, ⟨45|13⟩, ⟨45|23⟩, ⟨45|123⟩ }.

The main drawback of this approach is that there are many more representable models than just the graphical ones. For instance, in the case |N| = 4 there are only 11 graphical types, while there are 80 g-representable, 53 g+-representable, 1098 d-representable and at least 304 d+-representable types. On the other hand, it is infeasible to fit data to most of the non-graphical independence models.

2 No Finite Characterization Exists

Definition. An independence model J over M is a forbidden minor for a class of independence models C if no model in C has a minor over M isomorphic to J. A class of independence models C is characterized by a finite collection of forbidden minors D if any independence model I satisfies: I ∈ C if and only if no minor of I is isomorphic to an element of D.

Note that there are other possible ways to define a finite characterization, e.g. a characterization by special inference rules (cf. [17]).

Theorem 1. None of the classes of g-/g+/d-/d+/b-/b+-representable models can be characterized by a finite collection of forbidden minors.

Theorem 1 is proved at the end of Chapters 2 and 3 in the thesis, Theorems 3 and 5 (or see [11]).

3 Representable Models Over N = {1, 2, 3, 4}

For N consisting of fewer than four elements, the problem of representability is easy to solve. All models over N with |N| ≤ 2 are representable in any of the above-mentioned senses. The potential candidates for representability over N = {1, 2, 3} are the models fulfilling properties (4) and (5) from Lemma 6; they fall into 10 types (8 types in the case of positive representability). Regarding d-/d+-representability there are no other constraints (all these models are representable). Regarding b-/b+-representability, I = { ⟨ab|c⟩, ⟨ab|∅⟩ } is not binary representable (cf. Lemma 19, pp. 42, in the thesis). Regarding g-/g+-representability, I = { ⟨ab|c⟩, ⟨ab|∅⟩ }, I = { ⟨ab|∅⟩, ⟨ac|∅⟩ } and I = { ⟨ab|∅⟩, ⟨ac|∅⟩, ⟨bc|∅⟩ } are not Gaussian representable (cf. Lemma 10, pp. 24, in the thesis). That is why we focus on N = {1, 2, 3, 4} from now to the end of the section.

3.1 Gaussian Distributional Framework

A g+-representable model must necessarily have g+-representable minors. All types over N = {1, 2, 3, 4} whose minors are g+-representable are plotted in Figure 4. As shown in [3], the last 5 of them, M54–M58, are not g+-representable. For the rest, a corresponding variance matrix has been found by a search through a space of positive definite matrices with integer entries. Therefore, there are 629 g+-representable independence models corresponding to the 53 types M1–M53.

For general Gaussian representability, the following two properties are shown in [12] (or the thesis, Lemma 13, pp. 29):

Lemma 8. Let I be a g-representable independence model and a, b, c and d distinct elements of N.

i) If { ⟨ab|c⟩, ⟨ac|b⟩ } ⊆ I, then either ξ_b ≡ ξ_c or { ⟨ab|∅⟩, ⟨ac|∅⟩ } ⊆ I.
ii) If { ⟨ab|cd⟩, ⟨ac|bd⟩ } ⊆ I, then either ξ_b ≡ ξ_c or { ⟨ab|d⟩, ⟨ac|d⟩ } ⊆ I or ⟨ad|bc⟩ ∈ I,

where ξ_b ≡ ξ_c stands for a functional dependence⁵ between ξ_b and ξ_c yielding { ⟨ab|c⟩, ⟨bd|c⟩, ⟨ac|b⟩, ⟨cd|b⟩, ⟨ab|cd⟩, ⟨bd|ac⟩, ⟨ac|bd⟩, ⟨cd|ab⟩ } ⊆ I.

M1–M88 in Figures 4 and 5 are the types having g-representable minors and not contradicting Lemma 8. However, it is proved in [12] (or the thesis, Lemma 14, pp. 30) that just M54–M58 and M86–M88 are not g-representable. Therefore, there are 877 g-representable independence models falling into 80 types. The achieved results are summarized in the following theorem:

Theorem 2. There exist 877 g-representable independence models over |N| = 4, assorting into 80 types. Among them there are 629 g+-representable independence models (53 types), and these are exactly the models meeting condition (5).

There is a conjecture that all g+-representable independence models are b+-representable. To verify it for |N| = 4, b+-representations of M25 and M37 are needed. The former analogous conjecture that all g-representable independence models are b-representable was rejected due to M71, which is g-representable but cannot be represented by a discrete distribution (or any other distribution with finite multiinformation). The problem is listed in the Open problems chapter of [15].

⁵ i.e. the correlation between ξ_b and ξ_c equals ±1
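Such a search is easy to mimic: given a candidate variance matrix, the induced model I(ξ) is obtained by testing the determinant criterion of Lemma 5 ii) on every elementary triplet. A sketch (illustrative, not the code used in the thesis):

induced.model <- function(Sigma, tol = 1e-9) {
  # list all <ab|C> with xi_a _||_ xi_b | xi_C, using det Sigma_{aC,bC} = 0
  n <- nrow(Sigma); out <- character(0)
  for (a in 1:(n - 1)) for (b in (a + 1):n) {
    rest <- setdiff(1:n, c(a, b))
    for (m in 0:(2^length(rest) - 1)) {
      C <- rest[bitwAnd(m, 2^(seq_along(rest) - 1)) > 0]
      if (abs(det(Sigma[c(a, C), c(b, C), drop = FALSE])) < tol)
        out <- c(out, paste0("<", a, b, "|", paste(C, collapse = ""), ">"))
    }
  }
  out
}

Sigma <- matrix(c( 1,  0, .5, .2,
                   0,  1, .3,  0,
                  .5, .3,  1,  0,
                  .2,  0,  0,  1), nrow = 4)   # diagonally dominant, hence regular
induced.model(Sigma)                           # contains <12|> since sigma_12 = 0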

Fig. 4: List of types M1–M53 and their Gaussian representations.

Fig. 5: List of types M54–M88 and their Gaussian representations.

3.2 Discrete Distributional Framework

The problem of (general) discrete representability has been solved by F. Matúš in [5], [6] and [7]; see [16] for an overview of the results. Only partial results exist for d+-representability and b-/b+-representability at this moment (cf. [14] or Chapter 3 in the thesis).

3.2.1 Discrete representability

In brief, due to Lemma 7, the intersection of two d-representable models is also d-representable. As a result, the class of all d-representable models over N can be described by the set C of so-called irreducible models, i.e. nontrivial⁶ d-representable models that cannot be written as the intersection of two other d-representable models. It is not difficult to verify that a nontrivial independence model I is d-representable if and only if there exists A ⊆ C such that I = ⋂_{C′∈A} C′. As shown in [5], [6] and [7], there are only 13 types of irreducible models; see Figure 6.

Fig. 6: Irreducible types.

Theorem 3. The d-representable independence models over N = {1, 2, 3, 4} assort into 1098 types.

⁶ i.e. different from the set T_N of all triplets over N

3.2.2 Positive binary representability

There are many special properties of b+-representable models. A few of them are listed below (proved in Lemma 20, pp. 43, in the thesis):

Lemma 9. If I is a b+-representable independence model over N and a, b, c, d are distinct elements of N, then

i) { ⟨ab|cd⟩, ⟨ab|c⟩, ⟨ad|∅⟩, ⟨cd|∅⟩ } ⊆ I ⟹ { ⟨ad|c⟩, ⟨cd|a⟩ } ⊆ I or ⟨bd|∅⟩ ∈ I,
ii) { ⟨ab|cd⟩, ⟨ab|c⟩, ⟨ad|∅⟩, ⟨bd|∅⟩ } ⊆ I ⟹ { ⟨ad|b⟩, ⟨bd|a⟩ } ⊆ I or ⟨cd|∅⟩ ∈ I,
iii) { ⟨ab|cd⟩, ⟨ab|∅⟩, ⟨bc|∅⟩, ⟨cd|∅⟩, ⟨ad|∅⟩ } ⊆ I ⟹ { ⟨ad|c⟩, ⟨cd|a⟩ } ⊆ I or { ⟨bd|c⟩, ⟨cd|b⟩ } ⊆ I,
iv) { ⟨ab|cd⟩, ⟨ad|b⟩, ⟨cd|b⟩ } ⊆ I ⟹ { ⟨bc|∅⟩, ⟨cd|∅⟩, ⟨bc|d⟩ } ⊆ I or ⟨ab|c⟩ ∈ I,
v) { ⟨ab|cd⟩, ⟨ab|∅⟩, ⟨bc|∅⟩, ⟨bd|∅⟩, ⟨ab|c⟩, ⟨bc|a⟩ } ⊆ I ⟹ ⟨ad|∅⟩ ∈ I or ⟨ac|∅⟩ ∈ I or ⟨cd|∅⟩ ∈ I or ⟨ad|c⟩ ∈ I or { ⟨ab|d⟩, ⟨bd|a⟩ } ⊆ I,
vi) { ⟨ab|cd⟩, ⟨ab|d⟩, ⟨bc|a⟩ } ⊆ I ⟹ ⟨ac|∅⟩ ∈ I or ⟨ad|∅⟩ ∈ I or ⟨cd|∅⟩ ∈ I or ⟨cd|a⟩ ∈ I or ⟨ac|d⟩ ∈ I.

Using the parametrization from Section 1, a search for rational representations was performed. The result is 139 b+-representable types. However, a lot of types still remain undecided with respect to b+-representability.

3.2.3 Positive discrete representability

On the one hand, all d+-representable independence models must be d-representable and must fulfill condition (5); there are 5547 such models (356 types). On the other hand, all b+-representable models are d+-representable. Closing the 139 b+-representable types found above under intersection (thanks to Lemma 7), we obtain 4555 d+-representable models (299 types). The 356 − 299 = 57 undecided types are shown in Figure 7. Further, the types P12, P49, P50, P52 and P54 (5 types, 52 models) were found to be d+-representable by a direct search for a representation (with a little help from computational software). We may conclude (Theorem 4, pp. 51, in the thesis):

Theorem 4. There exist between 4607 and 5547 d+-representable independence models (304 to 356 types).

It has been conjectured that all 5547 models (356 types) are d+-representable. To prove this, representations of the types P2, P3, P19, P24, P30, P32, P37, P39, P41, P45, P46 and P57 from Figure 7 have to be found.

Fig. 7: d-representable types meeting (5) but not found as an intersection of the 139 types known to be b+-representable.

4 Probability Distributions Over an Independence Model

Definition. If I is an independence model, then the family of distributions of random vectors ξ such that I ⊆ I(ξ) will be denoted by P(I).

Usually we fix the distributional framework; this will be indicated by adding a lower index to P. For example, P_g({ ⟨12|∅⟩ }) is the family of all Gaussian distributions of ξ such that ξ_1 ⊥⊥ ξ_2 (i.e. the element σ_{12} of the variance matrix is 0).

4.1 Connection Between Graphical Independence Models and Exponential Families in the Regular Distributional Framework

In the regular Gaussian distributional framework there is an interesting connection between graphical independence models and exponential families: all graphical models are g+-representable, and the families of (regular) Gaussian distributions over them are exponential, as stated in the following lemma (cf. [2]).

Lemma 10. If G is an undirected graph, then P_{g+}(I(G)) is exponential.

More importantly, this property can be reversed. A sketch of a proof is included in [13], but the idea comes from an oral communication with F. Matúš.

Lemma 11. If P_{g+}(I) is exponential, then there exists an undirected graph G such that I = I(G).

4.2 Model Search

Searching for an appropriate statistical model based on data consists of two processes: the parametrization of P(I) and the estimation of parameters (usually by maximization of the log-likelihood function) over P(I) for a fixed I; and the search for the right independence model I, either by CI testing or by maximization of some information criterion (like AIC or BIC). For an overview see Sections 4.2 and 4.3 in the thesis.

Consider now |N| = 4 and the regular Gaussian distributional framework, in which all representable models can be parameterized. There are 53 g+-representable types, but only 11 of them are graphical. Let X be a dataset with 4 columns and k rows (observations). The estimate of the mean parameter is the same for all models:

μ̂_a = (1/k) Σ_{b=1}^k X_{ba},  a = 1, ..., 4.

The variance matrix Σ may be parameterized as

Σ = diag(σ_{11}, σ_{22}, σ_{33}, σ_{44}) · Σ_0 · diag(σ_{11}, σ_{22}, σ_{33}, σ_{44}),

where

Σ_0 = ( 1     ρ_12  ρ_13  ρ_14
        ρ_21  1     ρ_23  ρ_24
        ρ_31  ρ_32  1     ρ_34
        ρ_41  ρ_42  ρ_43  1 ).

Often a parametrization by the inverse variance matrix V = Σ^{−1} is more natural. Analogously, V may be parameterized as

V = diag(v_{11}, v_{22}, v_{33}, v_{44}) · V_0 · diag(v_{11}, v_{22}, v_{33}, v_{44}),

where V_0 is a matrix with ones on the diagonal; let us denote V_0 = (w_{ij})_{i,j=1,...,4}. The correspondence between independence restrictions on Σ and on V has been described in Lemma 5 iv).

The parameters σ_{11}, ..., σ_{44} can be estimated as

σ̂²_{aa} = (1/k) Σ_{b=1}^k (X_{ba} − μ̂_a)²,  a = 1, ..., 4;

analogously, v_{11}, ..., v_{44} can be estimated from the diagonal of the maximum likelihood estimate of Σ^{−1} in the saturated model (no independence restrictions). The constraints on the remaining parameters for the types M1–M53 (see Figure 4) are listed below. The estimates of the off-diagonal elements of Σ_0 or V_0 must be found by some iterative optimization method (like optim in the R environment, [10]).

M1) ρ_12 = 0
M2) ρ_12 = ρ_14 ρ_24
M3) ρ_12 = (ρ_14 ρ_24 + ρ_13 ρ_23 − ρ_13 ρ_24 ρ_34 − ρ_14 ρ_23 ρ_34) / (1 − ρ_34²)
M4) ρ_12 = 0, ρ_13 = ρ_14 (ρ_23 ρ_34 − ρ_24) / (ρ_24 ρ_34 − ρ_23)
M5) ρ_12 = 0, ρ_34 = (ρ_23 (1 − ρ_14²) + ρ_13 ρ_24 ρ_14) / ρ_24
M6) ρ_12 = 0, ρ_34 = ρ_13 ρ_14 + ρ_23 ρ_24

M7) ρ_12 = 0, ρ_34 = ρ_14 ρ_13
M8) ρ_12 = ρ_14 ρ_24, ρ_34 = ρ_14 ρ_13
M9) w_12 = 0, w_34 = 0
M10) ρ_12 = ρ_14 ρ_24, ρ_13 = (ρ_34 + ρ_23 ρ_24 ρ_14² − ρ_23 ρ_24 − ρ_14² ρ_24² ρ_34) / (ρ_14 (1 − ρ_24²))
M11) ρ_12 = 0, ρ_23 = ρ_24 ρ_34
M12) ρ_12 = 0, ρ_34 = 0
M13) ρ_12 = ρ_14 ρ_24, ρ_13 = ρ_14 ρ_24 / ρ_23
M14) ρ_12 = ρ_14 ρ_24, ρ_13 = ρ_14 (1 − ρ_23² − ρ_24² + ρ_23 ρ_24 ρ_34) / (ρ_34 − ρ_23 ρ_24)
M15) ρ_14 = ρ_13 ρ_34, ρ_12 = ρ_24 ρ_13 ρ_34
M16) w_14 = 0, w_23 = 0, w_12 = w_13 w_24 w_34 / (1 − w_34²)
M17) ρ_12 = 0, ρ_14 = ρ_13 ρ_34, ρ_23 = ρ_34 (1 − ρ_13²) / ρ_24
M18) ρ_12 = 0, ρ_34 = ρ_13 ρ_14, ρ_24 = ρ_14 (1 − ρ_13² − ρ_23²) / (ρ_13 ρ_23)
M19) ρ_12 = 0, ρ_34 = 0, ρ_13 = ρ_23 (1 − ρ_14²) / (ρ_14 ρ_24)
M20) w_12 = 0, w_34 = 0, w_13 = w_14 w_24 / w_23
M21) ρ_12 = 0, ρ_34 = ρ_13 ρ_14, ρ_23 = ρ_24 ρ_14 (1 − ρ_13²) / (ρ_13 (1 − ρ_14²))
M22) ρ_12 = 0, ρ_34 = ρ_23 ρ_24, ρ_13 = ρ_23 ρ_24 / ρ_14
M23) ρ_12 = 0, ρ_23 = ρ_24 ρ_34, ρ_14 = ρ_13 ρ_34
M24) ρ_12 = 0, ρ_34 = ρ_23 ρ_24, ρ_14 = ρ_13 ρ_23 ρ_24
M25) ρ_12 = 0, ρ_14 = ρ_13 ρ_34, ρ_23 = (1 − ρ_24² − ρ_34²) / (ρ_24 ρ_34)
M26) ρ_12 = ρ_14 ρ_24, ρ_13 = ρ_14 ρ_24 / ρ_23, ρ_34 = ρ_23 ρ_24
M27) ρ_12 = 0, ρ_14 = ρ_13 ρ_34, ρ_24 = ρ_23 (1 − ρ_13² ρ_34²) / (ρ_34 (1 − ρ_13²))
M28) ρ_12 = ρ_14 ρ_24, ρ_23 = ρ_14 ρ_24 ρ_13, ρ_34 = ρ_14 ρ_24² ρ_13
M29) w_14 = 0, w_12 = w_13 w_23, w_34 = w_23 w_24
M30) ρ_12 = 0, ρ_34 = 0, ρ_13 = ρ_14 ρ_24 / ρ_23

M31) w_23 = 0, w_14 = w_13 w_34, w_12 = w_13 w_34 w_24
M32) w_34 = 0, w_12 = w_14 w_24, w_13 = w_14 w_24 / w_23
M33) ρ_12 = 0, ρ_23 = 0
M34) w_12 = 0, w_23 = 0
M35) ρ_12 = 0, ρ_34 = 0, ρ_13 = ±ρ_24, ρ_14 = ρ_23
M36) ρ_34 = ρ_23 ρ_24, ρ_12 = ±ρ_23 ρ_24, ρ_13 = ±ρ_24, ρ_14 = ±ρ_23
M37) ρ_12 = 0, ρ_23 = ρ_24 ρ_34, ρ_13 = ±√(1 − ρ_24²), ρ_14 = ±ρ_34 √(1 − ρ_24²)
M38) ρ_12 = 0, ρ_14 = 0, ρ_23 = ρ_24 (1 − ρ_13²) / ρ_34
M39) ρ_12 = 0, ρ_23 = 0, ρ_13 = ρ_14 ρ_34
M40) w_14 = 0, w_24 = 0, w_13 = w_12 (1 − w_34²) / w_23
M41) w_14 = 0, w_24 = 0, w_12 = w_13 w_23
M42) ρ_12 = 0, ρ_13 = 0, ρ_23 = 0
M43) w_12 = 0, w_13 = 0, w_23 = 0
M44) ρ_12 = 0, ρ_23 = 0, ρ_14 = ρ_13 ρ_34
M45) ρ_12 = 0, ρ_23 = 0, ρ_34 = 0
M46) w_12 = 0, w_23 = 0, w_34 = 0
M47) ρ_12 = 0, ρ_13 = 0, ρ_14 = 0
M48) ρ_12 = 0, ρ_23 = 0, ρ_13 = 0, ρ_14 = 0
M49) w_12 = 0, w_13 = 0, w_23 = 0, w_14 = 0
M50) ρ_12 = 0, ρ_14 = 0, ρ_23 = 0, ρ_34 = 0
M51) ρ_12 = 0, ρ_13 = 0, ρ_14 = 0, ρ_23 = 0, ρ_34 = 0
M52) ρ_12 = 0, ρ_13 = 0, ρ_14 = 0, ρ_24 = 0, ρ_23 = 0, ρ_34 = 0
M53) saturated model (no constraints)

The model selection can be done by the AIC/BIC criterion. The dimension D of the parameter space equals 14 minus the number of independence constraints for the given type. For example, in the case of M11, D = 14 − 2 = 12 because there are constraints on ρ_12 and ρ_23.
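As an illustration of the estimation step, the following R sketch (a minimal stand-in, not the yest implementation) maximizes the Gaussian log-likelihood over the free correlations of type M2, substituting the constraint ρ_12 = ρ_14 ρ_24 directly into the correlation matrix:

fit.M2 <- function(X) {
  # X: k x 4 data matrix; free parameters r = (rho_13, rho_14, rho_23, rho_24, rho_34)
  k <- nrow(X)
  S <- cov(X) * (k - 1) / k                  # ML estimate of the variance matrix
  s <- sqrt(diag(S))
  nll <- function(r) {
    R <- diag(4)
    R[1, 3] <- R[3, 1] <- r[1]; R[1, 4] <- R[4, 1] <- r[2]
    R[2, 3] <- R[3, 2] <- r[3]; R[2, 4] <- R[4, 2] <- r[4]
    R[3, 4] <- R[4, 3] <- r[5]
    R[1, 2] <- R[2, 1] <- r[2] * r[4]        # the M2 constraint rho_12 = rho_14 rho_24
    if (min(eigen(R, symmetric = TRUE, only.values = TRUE)$values) < 1e-8)
      return(1e10)                           # keep the correlation matrix positive definite
    Sigma <- diag(s) %*% R %*% diag(s)
    (k / 2) * (log(det(Sigma)) + sum(diag(solve(Sigma) %*% S)))
  }
  optim(rep(0, 5), nll)$par                  # Nelder-Mead over the 5 free correlations
}

The same pattern, with the constraint line replaced, covers any of the ρ-parameterized types above; the w-parameterized types are handled analogously through V = Σ^{−1}.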

4.3 Simulation Study

The variance matrix estimates and the model selection algorithm from the previous subsection have been implemented in the R package yest attached to the thesis. The model selection process takes about 9 minutes (AMD Sempron 3500+, 1800 MHz, 1 GB RAM). Note that, because the sufficient statistic for all the estimates above is the maximum likelihood parameter estimate in the saturated model, the size of the dataset does not greatly influence the computation time.

Validity has been verified by the following algorithm:

1. For each of the 53 types take one model (denoted by I) and randomly generate an inverse variance matrix corresponding to I (denoted by V).
2. Generate a dataset with k rows, each row from the multivariate normal distribution with zero mean parameter and variance matrix V^{−1}.
3. Based on AIC/BIC select one of the 629 g+-representable independence models (denoted by J).
4. The model selection is called successful if I = J; otherwise it is not successful.

A transcription to R follows:

I <- ind.identification(type = testovany.typ)$model   # 'testovany.typ' = the tested type
V <- ind.rgauss(model = I)
data <- generate.data(V, k)
J <- model.selection(data)$model
if (I == J) print("Success") else print("Failure")

Fig. 8: Success rate of AIC (left) and BIC (right) model selection; probability of success plotted against log(k).

The simulation was repeated four times for each of the 53 types and 15 possible values of k. A graph of the average success rate as a function of log(k) is shown in Figure 8.

A failure of the model selection does not automatically imply bad behavior of the variance matrix estimate. The simulation study has therefore been repeated, but now the Kullback–Leibler divergence⁷ between the estimated and the true distribution has been measured. Four estimates have been evaluated: an estimate based on the saturated model (the usual one), an estimate based on BIC model selection over graphical models, an estimate based on BIC model selection over all independence models, and an estimate based on the actual independence model. The results are visualized in Figure 9: the trimmed⁸ (20%) average difference between the KL divergence of the inspected estimate to the true model and the KL divergence of the estimate based on the saturated model to the true model is drawn against the logarithm of the number k of rows in the dataset.

Fig. 9: Trimmed mean of the difference (left: overview, right: detail) against log(k). Saturated model: full line; graphical models: dashed line; independence models: dotted line; actual independence model: dash-dotted line.

We may conclude that to have a good chance (> 80%) of determining the true independence model by BIC model selection, we need about 50 thousand observations. However, for such a large amount of data the quality of the variance estimator is similar to the quality of the estimator based on the saturated model (the usual one). It seems that the improvement (in the sense of average KL divergence in comparison with the estimator based on the saturated model) is significant when the number of observations lies between 200 and the tens of thousands.

⁷ That is, DIV(µ, Σ, µ̃, Σ̃) = ½ ( log(|Σ̃| / |Σ|) + tr(Σ̃^{−1} Σ) + (µ̃ − µ)′ Σ̃^{−1} (µ̃ − µ) − 4 ), where µ, Σ and µ̃, Σ̃ are the parameters of the first and the second distribution, respectively.
⁸ Without trimming the figures are similar but a bit messier.
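The divergence of footnote 7 is, for reference, a direct transcription into R (illustrative):

kl.gauss <- function(mu1, S1, mu2, S2) {
  # KL divergence between N(mu1, S1) and N(mu2, S2) in dimension n
  n <- length(mu1)
  0.5 * drop(log(det(S2) / det(S1)) + sum(diag(solve(S2) %*% S1)) +
             t(mu2 - mu1) %*% solve(S2) %*% (mu2 - mu1) - n)
}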

It is obvious that the quality of the estimation of a variance matrix using just the graphical models is unsatisfactory if the true model is not graphical and the number of observations is below a certain bound.

An interesting direction of future research might be the study of the distributional families above in terms of curved exponential families (to speed up the computation), and the estimation of variance matrices for more than four Gaussian distributed random variables.

References

[1] J. Anděl. Matematická statistika. SNTL/ALFA.
[2] S. L. Lauritzen. Graphical Models. Oxford University Press.
[3] R. Lněnička. On Gaussian conditional independence structures. Technical report, ÚTIA AV ČR.
[4] F. Matúš. On equivalence of Markov properties over undirected graphs. Journal of Applied Probability, 29.
[5] F. Matúš and M. Studený. Conditional independences among four random variables I. Combinatorics, Probability & Computing, 4.
[6] F. Matúš. Conditional independences among four random variables II. Combinatorics, Probability & Computing, 4.
[7] F. Matúš. Conditional independences among four random variables III. Combinatorics, Probability & Computing, 8.
[8] R. E. Neapolitan. Learning Bayesian Networks. Chapman & Hall/CRC.
[9] C. R. Rao. Linear Statistical Inference and Its Applications. John Wiley & Sons.
[10] R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
[11] P. Šimeček. Classes of Gaussian, discrete and binary representable independence models have no finite characterization. In M. Hušková and M. Janžura, editors, Proceedings of Prague Stochastics 2006, CD volume. MATFYZPRESS.
[12] P. Šimeček. Gaussian representation of independence models over four random variables. In A. Rizzi, editor, Proceedings of COMPSTAT 2006, CD volume. Springer.
[13] P. Šimeček. Independence models. In J. Vejnarová, editor, Proceedings of Workshop on Uncertainty Processing 2006. University of Economics.
[14] P. Šimeček. A short note on discrete representability of independence models. In M. Studený and J. Vomlel, editors, Proceedings of Workshop on Probabilistic Graphical Models 2006. Action M Agency.

[15] M. Studený. Probabilistic Conditional Independence Structures. Springer.
[16] M. Studený and P. Boček. CI-models arising among 4 random variables. In Proceedings of the 3rd workshop WUPES.
[17] M. Studený and J. Vejnarová. The multiinformation function as a tool for measuring stochastic dependence. In M. I. Jordan, editor, Learning in Graphical Models. Kluwer.

List of Publications

P. Šimeček. A short note on structure learning. In Jana Šafránková, editor, Proceedings of the 14th Annual Conference of Doctoral Students - WDS 2004, Prague. Matfyzpress.
P. Šimeček and M. Studený. Využití Hilbertovy báze k ověřování shodnosti strukturálních a kombinatorických imsetů. In Jaromír Antoch and Gejza Dohnal, editors, Sborník prací 13. letní školy JČMF ROBUST. Jednota českých matematiků a fyziků.
P. Šimeček. On the minimal probability of intersection of conditionally independent events. Submitted to Kybernetika.
P. Šimeček. Gene Expression Data Analysis for In Vitro Toxicology. In Jaromír Antoch and Gejza Dohnal, editors, Sborník prací 14. letní školy JČMF ROBUST. Jednota českých matematiků a fyziků.
Č. Štuka and P. Šimeček. Studium souvislostí mezi úspěšností studia medicíny, známkami na střední škole a výsledky přijímacích zkoušek. In Sborník konference MEDSOFT. Action M Agency.
P. Šimeček. Classes of Gaussian, discrete and binary representable independence models have no finite characterization. In M. Hušková and M. Janžura, editors, Proceedings of Prague Stochastics 2006, CD volume. MATFYZPRESS.
P. Šimeček. Gaussian representation of independence models over four random variables. In A. Rizzi, editor, Proceedings of COMPSTAT 2006, CD volume. Springer.
P. Šimeček. A short note on discrete representability of independence models. In M. Studený and J. Vomlel, editors, Proceedings of Workshop on Probabilistic Graphical Models 2006. Action M Agency.
P. Šimeček. Independence models. In J. Vejnarová, editor, Proceedings of Workshop on Uncertainty Processing 2006. University of Economics.


More information

Tree-width and planar minors

Tree-width and planar minors Tree-width and planar minors Alexander Leaf and Paul Seymour 1 Princeton University, Princeton, NJ 08544 May 22, 2012; revised March 18, 2014 1 Supported by ONR grant N00014-10-1-0680 and NSF grant DMS-0901075.

More information

Junction Tree, BP and Variational Methods

Junction Tree, BP and Variational Methods Junction Tree, BP and Variational Methods Adrian Weller MLSALT4 Lecture Feb 21, 2018 With thanks to David Sontag (MIT) and Tony Jebara (Columbia) for use of many slides and illustrations For more information,

More information

Propositions of a thesis: Mathematical Modelling and Numerical Solution of Some 2D- and 3D-cases of Atmospheric Boundary Layer Flow

Propositions of a thesis: Mathematical Modelling and Numerical Solution of Some 2D- and 3D-cases of Atmospheric Boundary Layer Flow CZECH TECHNICAL UNIVERSITY Faculty of Nuclear Sciences and Physical Engineering Propositions of a thesis: Mathematical Modelling and Numerical Solution of Some 2D- and 3D-cases of Atmospheric Boundary

More information

Cocliques in the Kneser graph on line-plane flags in PG(4, q)

Cocliques in the Kneser graph on line-plane flags in PG(4, q) Cocliques in the Kneser graph on line-plane flags in PG(4, q) A. Blokhuis & A. E. Brouwer Abstract We determine the independence number of the Kneser graph on line-plane flags in the projective space PG(4,

More information

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find

More information

Markov properties for undirected graphs

Markov properties for undirected graphs Graphical Models, Lecture 2, Michaelmas Term 2009 October 15, 2009 Formal definition Fundamental properties Random variables X and Y are conditionally independent given the random variable Z if L(X Y,

More information

32 Divisibility Theory in Integral Domains

32 Divisibility Theory in Integral Domains 3 Divisibility Theory in Integral Domains As we have already mentioned, the ring of integers is the prototype of integral domains. There is a divisibility relation on * : an integer b is said to be divisible

More information

Zero controllability in discrete-time structured systems

Zero controllability in discrete-time structured systems 1 Zero controllability in discrete-time structured systems Jacob van der Woude arxiv:173.8394v1 [math.oc] 24 Mar 217 Abstract In this paper we consider complex dynamical networks modeled by means of state

More information

SYMBOL EXPLANATION EXAMPLE

SYMBOL EXPLANATION EXAMPLE MATH 4310 PRELIM I REVIEW Notation These are the symbols we have used in class, leading up to Prelim I, and which I will use on the exam SYMBOL EXPLANATION EXAMPLE {a, b, c, } The is the way to write the

More information

Curve Fitting Re-visited, Bishop1.2.5

Curve Fitting Re-visited, Bishop1.2.5 Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the

More information

Codes on graphs. Chapter Elementary realizations of linear block codes

Codes on graphs. Chapter Elementary realizations of linear block codes Chapter 11 Codes on graphs In this chapter we will introduce the subject of codes on graphs. This subject forms an intellectual foundation for all known classes of capacity-approaching codes, including

More information

Chapter 8. P-adic numbers. 8.1 Absolute values

Chapter 8. P-adic numbers. 8.1 Absolute values Chapter 8 P-adic numbers Literature: N. Koblitz, p-adic Numbers, p-adic Analysis, and Zeta-Functions, 2nd edition, Graduate Texts in Mathematics 58, Springer Verlag 1984, corrected 2nd printing 1996, Chap.

More information

Integrating Correlated Bayesian Networks Using Maximum Entropy

Integrating Correlated Bayesian Networks Using Maximum Entropy Applied Mathematical Sciences, Vol. 5, 2011, no. 48, 2361-2371 Integrating Correlated Bayesian Networks Using Maximum Entropy Kenneth D. Jarman Pacific Northwest National Laboratory PO Box 999, MSIN K7-90

More information

Belief Update in CLG Bayesian Networks With Lazy Propagation

Belief Update in CLG Bayesian Networks With Lazy Propagation Belief Update in CLG Bayesian Networks With Lazy Propagation Anders L Madsen HUGIN Expert A/S Gasværksvej 5 9000 Aalborg, Denmark Anders.L.Madsen@hugin.com Abstract In recent years Bayesian networks (BNs)

More information

SOME DESIGNS AND CODES FROM L 2 (q) Communicated by Alireza Abdollahi

SOME DESIGNS AND CODES FROM L 2 (q) Communicated by Alireza Abdollahi Transactions on Combinatorics ISSN (print): 2251-8657, ISSN (on-line): 2251-8665 Vol. 3 No. 1 (2014), pp. 15-28. c 2014 University of Isfahan www.combinatorics.ir www.ui.ac.ir SOME DESIGNS AND CODES FROM

More information

A Solution of a Tropical Linear Vector Equation

A Solution of a Tropical Linear Vector Equation A Solution of a Tropical Linear Vector Equation NIKOLAI KRIVULIN Faculty of Mathematics and Mechanics St. Petersburg State University 28 Universitetsky Ave., St. Petersburg, 198504 RUSSIA nkk@math.spbu.ru

More information

Linear Algebra. Preliminary Lecture Notes

Linear Algebra. Preliminary Lecture Notes Linear Algebra Preliminary Lecture Notes Adolfo J. Rumbos c Draft date April 29, 23 2 Contents Motivation for the course 5 2 Euclidean n dimensional Space 7 2. Definition of n Dimensional Euclidean Space...........

More information

p L yi z n m x N n xi

p L yi z n m x N n xi y i z n x n N x i Overview Directed and undirected graphs Conditional independence Exact inference Latent variables and EM Variational inference Books statistical perspective Graphical Models, S. Lauritzen

More information

Graph coloring, perfect graphs

Graph coloring, perfect graphs Lecture 5 (05.04.2013) Graph coloring, perfect graphs Scribe: Tomasz Kociumaka Lecturer: Marcin Pilipczuk 1 Introduction to graph coloring Definition 1. Let G be a simple undirected graph and k a positive

More information

MATH 8253 ALGEBRAIC GEOMETRY WEEK 12

MATH 8253 ALGEBRAIC GEOMETRY WEEK 12 MATH 8253 ALGEBRAIC GEOMETRY WEEK 2 CİHAN BAHRAN 3.2.. Let Y be a Noetherian scheme. Show that any Y -scheme X of finite type is Noetherian. Moreover, if Y is of finite dimension, then so is X. Write f

More information

Directed acyclic graphs and the use of linear mixed models

Directed acyclic graphs and the use of linear mixed models Directed acyclic graphs and the use of linear mixed models Siem H. Heisterkamp 1,2 1 Groningen Bioinformatics Centre, University of Groningen 2 Biostatistics and Research Decision Sciences (BARDS), MSD,

More information

More Books At www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com www.goalias.blogspot.com

More information

series. Utilize the methods of calculus to solve applied problems that require computational or algebraic techniques..

series. Utilize the methods of calculus to solve applied problems that require computational or algebraic techniques.. 1 Use computational techniques and algebraic skills essential for success in an academic, personal, or workplace setting. (Computational and Algebraic Skills) MAT 203 MAT 204 MAT 205 MAT 206 Calculus I

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm 1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 11 CRFs, Exponential Family CS/CNS/EE 155 Andreas Krause Announcements Homework 2 due today Project milestones due next Monday (Nov 9) About half the work should

More information

Notes. Relations. Introduction. Notes. Relations. Notes. Definition. Example. Slides by Christopher M. Bourke Instructor: Berthe Y.

Notes. Relations. Introduction. Notes. Relations. Notes. Definition. Example. Slides by Christopher M. Bourke Instructor: Berthe Y. Relations Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry Spring 2006 Computer Science & Engineering 235 Introduction to Discrete Mathematics Sections 7.1, 7.3 7.5 of Rosen cse235@cse.unl.edu

More information

Causality in Econometrics (3)

Causality in Econometrics (3) Graphical Causal Models References Causality in Econometrics (3) Alessio Moneta Max Planck Institute of Economics Jena moneta@econ.mpg.de 26 April 2011 GSBC Lecture Friedrich-Schiller-Universität Jena

More information

Tactical Decompositions of Steiner Systems and Orbits of Projective Groups

Tactical Decompositions of Steiner Systems and Orbits of Projective Groups Journal of Algebraic Combinatorics 12 (2000), 123 130 c 2000 Kluwer Academic Publishers. Manufactured in The Netherlands. Tactical Decompositions of Steiner Systems and Orbits of Projective Groups KELDON

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 12: Gaussian Belief Propagation, State Space Models and Kalman Filters Guest Kalman Filter Lecture by

More information

arxiv: v3 [math.ds] 21 Jan 2018

arxiv: v3 [math.ds] 21 Jan 2018 ASYMPTOTIC BEHAVIOR OF CONJUNCTIVE BOOLEAN NETWORK OVER WEAKLY CONNECTED DIGRAPH XUDONG CHEN, ZUGUANG GAO, AND TAMER BAŞAR arxiv:1708.01975v3 [math.ds] 21 Jan 2018 Abstract. A conjunctive Boolean network

More information

An Introduction to Reversible Jump MCMC for Bayesian Networks, with Application

An Introduction to Reversible Jump MCMC for Bayesian Networks, with Application An Introduction to Reversible Jump MCMC for Bayesian Networks, with Application, CleverSet, Inc. STARMAP/DAMARS Conference Page 1 The research described in this presentation has been funded by the U.S.

More information

Diagnostic quiz for M303: Further Pure Mathematics

Diagnostic quiz for M303: Further Pure Mathematics Diagnostic quiz for M303: Further Pure Mathematics Am I ready to start M303? M303, Further pure mathematics is a fascinating third level module, that builds on topics introduced in our second level module

More information

Chapter 17: Undirected Graphical Models

Chapter 17: Undirected Graphical Models Chapter 17: Undirected Graphical Models The Elements of Statistical Learning Biaobin Jiang Department of Biological Sciences Purdue University bjiang@purdue.edu October 30, 2014 Biaobin Jiang (Purdue)

More information

Gaussian Mixture Models

Gaussian Mixture Models Gaussian Mixture Models David Rosenberg, Brett Bernstein New York University April 26, 2017 David Rosenberg, Brett Bernstein (New York University) DS-GA 1003 April 26, 2017 1 / 42 Intro Question Intro

More information