Introduction to Algebraic Statistics
1 Introduction to Algebraic Statistics
Seth Sullivant
North Carolina State University
January 5, 2017
Seth Sullivant (NCSU) Algebraic Statistics January 5, / 28
2 What is Algebraic Statistics?
Observation: Statistical models are semialgebraic sets.
- Tools from algebraic geometry can uncover properties of statistical models.
- Algebro-geometric structure leads to new methods to analyze data.
- Algebraic properties of the model explain properties of inference procedures.
- Theoretical questions about statistical models lead to challenging problems in algebraic geometry.
3 Semialgebraic Sets
$\mathbb{R}[p_1, \ldots, p_r]$ = ring of polynomials in indeterminates $p_1, \ldots, p_r$ with coefficients in $\mathbb{R}$.
Definition: A basic semialgebraic set is the solution set of finitely many polynomial equations and inequalities over $\mathbb{R}$, that is, a set of the form
$\{a \in \mathbb{R}^r : f(a) = 0 \text{ for all } f \in F,\ g(a) > 0 \text{ for all } g \in G\}$
where $F, G \subseteq \mathbb{R}[p_1, \ldots, p_r]$ are finite. A semialgebraic set is a union of finitely many basic semialgebraic sets.
Semialgebraic sets are the basic objects of study in real algebraic geometry.
4 Semialgebraic Sets
$M_2 = \{(p_0, p_1, p_2) \in \mathbb{R}^3 : p_0 \geq 0,\ p_1 \geq 0,\ p_2 \geq 0,\ p_0 + p_1 + p_2 = 1,\ 4p_0p_2 = p_1^2\}$
Theorem (Tarski-Seidenberg): The class of semialgebraic sets is closed under rational maps. That is, if $S \subseteq \mathbb{R}^r$ is a semialgebraic set and $\varphi : \mathbb{R}^r \to \mathbb{R}^s$ is a rational map, then $\varphi(S) \subseteq \mathbb{R}^s$ is a semialgebraic set.
5 Statistical Models
Definition: A statistical model is a set of probability distributions.
For this definition to be useful for us:
- Focus on a fixed state space.
- The set of distributions should be related to each other in a meaningful way.
To keep things simple for this talk, the state space will always be a finite set, like $[r] := \{1, 2, \ldots, r\}$. In this case a probability distribution is just a point $p \in \mathbb{R}^r$:
$p = (p_1, \ldots, p_r) = (P(X = 1), \ldots, P(X = r)) \in \mathbb{R}^r$
satisfying $\sum_{i=1}^r p_i = 1$ and $p_i \geq 0$ for all $i \in [r]$.
6 Example: Binomial Random Variable Model
We have a biased coin with probability $\theta$ of getting a head. Flip the coin $r$ times, and count the number of heads $X$:
$P(X = i) = \binom{r}{i} \theta^i (1 - \theta)^{r-i}$
The binomial probability distribution is the point in $\mathbb{R}^{r+1}$
$((1-\theta)^r,\ r\theta(1-\theta)^{r-1},\ \ldots,\ r\theta^{r-1}(1-\theta),\ \theta^r)$.
The binomial random variable model is the set
$M_r := \{((1-\theta)^r, r\theta(1-\theta)^{r-1}, \ldots, r\theta^{r-1}(1-\theta), \theta^r) : \theta \in [0, 1]\}$
For example, $M_2 = \{(p_0, p_1, p_2) \in \mathbb{R}^3 : p_0 \geq 0, p_1 \geq 0, p_2 \geq 0, p_0 + p_1 + p_2 = 1, 4p_0p_2 = p_1^2\}$.
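As a quick sanity check, the parametrization and the implicit description of $M_2$ can be compared numerically. This is a minimal sketch (the function name is ours, and NumPy is assumed available):

```python
from math import comb

import numpy as np

def binomial_model_point(theta, r):
    """Point of the binomial model M_r in R^{r+1} for coin bias theta."""
    return np.array([comb(r, i) * theta**i * (1 - theta)**(r - i)
                     for i in range(r + 1)])

p = binomial_model_point(0.3, 2)
p0, p1, p2 = p

# The point lies on the probability simplex and satisfies the
# invariant 4*p0*p2 = p1^2 that cuts out M_2.
assert np.isclose(p.sum(), 1.0)
assert np.isclose(4 * p0 * p2, p1 ** 2)
```

Any choice of $\theta \in [0,1]$ passes the same checks, which is exactly the statement that the image of the parametrization lies in the semialgebraic set $M_2$.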
7 Independence Model
Definition: Discrete random variables $X$ and $Y$ are independent if for all $i$ and $j$
$P(X = i, Y = j) = P(X = i) \cdot P(Y = j)$.  (1)
Denote this $X \perp Y$.
Fix the state spaces of $X$ and $Y$ as $[r]$ and $[s]$. The model of independence $M_{X \perp Y} \subseteq \mathbb{R}^{r \times s}$ is the set of all distributions satisfying (1):
$M_{X \perp Y} = \left\{ p = \begin{pmatrix} p_{11} & \cdots & p_{1s} \\ \vdots & & \vdots \\ p_{r1} & \cdots & p_{rs} \end{pmatrix} : p \text{ satisfies } (1) \right\}$
8 Independence Model
Note that, writing the joint distribution as the $r \times s$ matrix $p = (p_{ij})$, the condition $P(X = i, Y = j) = P(X = i) \cdot P(Y = j)$ means that
$p = \begin{pmatrix} P(X = 1) \\ \vdots \\ P(X = r) \end{pmatrix} \begin{pmatrix} P(Y = 1) & \cdots & P(Y = s) \end{pmatrix}$
Proposition: The independence model $M_{X \perp Y}$ is the semialgebraic set
$\{p \in \mathbb{R}^{r \times s} : p_{ij} \geq 0 \text{ for all } i, j,\ \sum_{i,j} p_{ij} = 1,\ \operatorname{rank} p = 1\}$
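The rank-1 factorization above is easy to see in code. A minimal sketch (marginal sizes and the random seed are our choices, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical marginals for X with r = 3 states and Y with s = 4 states.
alpha = rng.dirichlet(np.ones(3))  # distribution of X
beta = rng.dirichlet(np.ones(4))   # distribution of Y

# Under independence the joint distribution is the outer product of the
# marginals, hence a nonnegative rank-1 matrix whose entries sum to 1.
p = np.outer(alpha, beta)

assert np.isclose(p.sum(), 1.0)
assert np.linalg.matrix_rank(p) == 1
```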
9 Statistical Models are Semialgebraic Sets
Most statistical models are given parametrically as images of rational maps whose domain is a polyhedron, e.g.
$\varphi : [0, 1] \to \mathbb{R}^{r+1},\quad \theta \mapsto ((1-\theta)^r, r\theta(1-\theta)^{r-1}, \ldots, r\theta^{r-1}(1-\theta), \theta^r)$
The Tarski-Seidenberg theorem guarantees these models are semialgebraic sets.
Problems to study with any family of models:
- Characterize the model implicitly.
- Characterize model properties.
- Use the algebraic structure to analyze procedures based on the model.
10 Phylogenetics
Problem: Given a collection of species, find the tree that explains their history.
Data consists of aligned DNA sequences from homologous genes:
Human:   ...ACCGTGCAACGTGAACGA...
Chimp:   ...ACCTTGGAAGGTAAACGA...
Gorilla: ...ACCGTGCAACGTAAACTA...
11 Model-Based Phylogenetics
- Use a probabilistic model of mutations.
- Parameters for the model are the combinatorial tree $T$ and rate parameters for mutations on each edge.
- Models give a probability for observing a particular aligned collection of DNA sequences:
Human:   ACCGTGCAACGTGAACGA
Chimp:   ACGTTGCAAGGTAAACGA
Gorilla: ACCGTGCAACGTAAACTA
- Assuming site independence, the data is summarized by the empirical distribution of columns in the alignment, e.g. $\hat{p}(AAA) = 6/18$, $\hat{p}(CGC) = 2/18$, etc.
- Use the empirical distribution and a test statistic to find the tree best explaining the data.
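The column summary above can be computed directly from the three sequences on the slide; a short sketch using only the standard library:

```python
from collections import Counter
from fractions import Fraction

# The aligned sequences from the slide above.
human   = "ACCGTGCAACGTGAACGA"
chimp   = "ACGTTGCAAGGTAAACGA"
gorilla = "ACCGTGCAACGTAAACTA"

# Each alignment column is one observation of (X_Human, X_Chimp, X_Gorilla).
columns = ["".join(col) for col in zip(human, chimp, gorilla)]
counts = Counter(columns)
n = len(columns)  # 18 sites

p_hat = {pattern: Fraction(c, n) for pattern, c in counts.items()}
print(p_hat["AAA"])  # 1/3  (= 6/18)
print(p_hat["CGC"])  # 1/9  (= 2/18)
```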
12 Phylogenetic Models
Assuming site independence, the phylogenetic model is a latent class graphical model:
- Each leaf $v \in T$ is a random variable $X_v \in \{A, C, G, T\}$.
- Internal nodes $v \in T$ are latent random variables $Y_v$.
[Figure: 3-leaf tree with leaves $X_1, X_2, X_3$ and internal nodes $Y_1, Y_2$]
$P(x_1, x_2, x_3) = \sum_{y_1} \sum_{y_2} P_1(y_1) P_2(y_2 \mid y_1) P_3(x_1 \mid y_1) P_4(x_2 \mid y_2) P_5(x_3 \mid y_2)$
13 Phylogenetic Models
Assuming site independence, the phylogenetic model is a latent class graphical model:
- Each leaf $v \in T$ is a random variable $X_v \in \{A, C, G, T\}$.
- Internal nodes $v \in T$ are latent random variables $Y_v$.
[Figure: 3-leaf tree with leaves $X_1, X_2, X_3$ and internal nodes $Y_1, Y_2$]
$p_{i_1 i_2 i_3} = \sum_{j_1} \sum_{j_2} \pi_{j_1} a_{j_2, j_1} b_{i_1, j_1} c_{i_2, j_2} d_{i_3, j_2}$
14 Algebraic Perspective on Phylogenetic Models
Once we fix a tree $T$ with $n$ leaves and a model structure, we get a map $\varphi^T : \Theta \to \mathbb{R}^{4^n}$.
- $\Theta \subseteq \mathbb{R}^d$ is the parameter space of numerical parameters (transition matrices associated to each edge).
- The map $\varphi^T$ is given by polynomial functions of the parameters.
- For each $i_1 \cdots i_n \in \{A, C, G, T\}^n$, $\varphi^T_{i_1 \cdots i_n}(\theta)$ gives the probability of the column $(i_1, \ldots, i_n)$ in the alignment for the particular parameter choice $\theta$:
$\varphi^T_{i_1 i_2 i_3}(\pi, a, b, c, d) = \sum_{j_1} \sum_{j_2} \pi_{j_1} a_{j_2, j_1} b_{i_1, j_1} c_{i_2, j_2} d_{i_3, j_2}$
The phylogenetic model is the set $M_T = \varphi^T(\Theta) \subseteq \mathbb{R}^{4^n}$.
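The polynomial map $\varphi^T$ for the 3-leaf tree is a single tensor contraction. A minimal sketch (the helper name, the stochastic conventions, and the random seed are our choices, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)

def column_stochastic(shape):
    """Random nonnegative matrix whose columns sum to 1."""
    m = rng.random(shape)
    return m / m.sum(axis=0)

k = 4  # states A, C, G, T
pi = rng.dirichlet(np.ones(k))   # root distribution pi_{j1}
A = column_stochastic((k, k))    # a_{j2,j1}
B = column_stochastic((k, k))    # b_{i1,j1}
C = column_stochastic((k, k))    # c_{i2,j2}
D = column_stochastic((k, k))    # d_{i3,j2}

# phi^T: p_{i1 i2 i3} = sum_{j1,j2} pi_{j1} a_{j2,j1} b_{i1,j1} c_{i2,j2} d_{i3,j2}
p = np.einsum('u,vu,iu,jv,kv->ijk', pi, A, B, C, D)

# The image point is a probability distribution on {A,C,G,T}^3.
assert np.isclose(p.sum(), 1.0)
assert (p >= 0).all()
```

Running this for many random parameter choices samples points of $M_T \subseteq \mathbb{R}^{4^3}$.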
15 Phylogenetic Varieties and Phylogenetic Invariants
Let $\mathbb{R}[p] := \mathbb{R}[p_{i_1 \cdots i_n} : i_1 \cdots i_n \in \{A, C, G, T\}^n]$.
Definition: Let $I_T := \langle f \in \mathbb{R}[p] : f(p) = 0 \text{ for all } p \in M_T \rangle \subseteq \mathbb{R}[p]$. $I_T$ is the ideal of phylogenetic invariants of $T$.
Let $V_T := \{p \in \mathbb{R}^{4^n} : f(p) = 0 \text{ for all } f \in I_T\}$. $V_T$ is the phylogenetic variety of $T$.
Note that $M_T \subseteq V_T$. Since $M_T$ is the image of a polynomial map, $\dim M_T = \dim V_T$.
16
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$]
$p_{lmno} = \sum_{i=A}^{T} \sum_{j=A}^{T} \sum_{k=A}^{T} \pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko}$
17
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$]
$p_{lmno} = \sum_{i=A}^{T} \sum_{j=A}^{T} \sum_{k=A}^{T} \pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko}$
Collect the factors involving $j$: $\sum_{j=A}^{T} a_{ij} c_{jl} d_{jm}$
18
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$]
$p_{lmno} = \sum_{i=A}^{T} \sum_{j=A}^{T} \sum_{k=A}^{T} \pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko}$
Collect the factors involving $j$ and those involving $k$: $\sum_{j=A}^{T} a_{ij} c_{jl} d_{jm}$ and $\sum_{k=A}^{T} b_{ik} e_{kn} f_{ko}$
19
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$]
$p_{lmno} = \sum_{i=A}^{T} \sum_{j=A}^{T} \sum_{k=A}^{T} \pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko} = \sum_{i=A}^{T} \pi_i \Big( \sum_{j=A}^{T} a_{ij} c_{jl} d_{jm} \Big) \Big( \sum_{k=A}^{T} b_{ik} e_{kn} f_{ko} \Big)$
20
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$]
$p_{lmno} = \sum_{i=A}^{T} \sum_{j=A}^{T} \sum_{k=A}^{T} \pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko} = \sum_{i=A}^{T} \pi_i \Big( \sum_{j=A}^{T} a_{ij} c_{jl} d_{jm} \Big) \Big( \sum_{k=A}^{T} b_{ik} e_{kn} f_{ko} \Big)$
Hence the $16 \times 16$ flattening matrix
$\begin{pmatrix} p_{AAAA} & p_{AAAC} & \cdots & p_{AATT} \\ p_{ACAA} & p_{ACAC} & \cdots & p_{ACTT} \\ \vdots & & & \vdots \\ p_{TTAA} & p_{TTAC} & \cdots & p_{TTTT} \end{pmatrix}$
has rank at most 4.
21 Splits and Phylogenetic Invariants
Definition: A split of a set is a bipartition $A|B$. A split $A|B$ of the leaves of a tree $T$ is valid for $T$ if the induced trees $T_A$ and $T_B$ do not intersect.
[Figure: 4-leaf tree with leaves $X_1, X_2, X_3, X_4$, with examples of valid and invalid splits]
22 2-way Flattenings and Matrix Ranks
Let $P \in M_T$ with $p_{ijkl} = P(X_1 = i, X_2 = j, X_3 = k, X_4 = l)$, and form the flattening
$\operatorname{Flat}_{12|34}(P) = \begin{pmatrix} p_{AAAA} & p_{AAAC} & p_{AAAG} & \cdots & p_{AATT} \\ p_{ACAA} & p_{ACAC} & p_{ACAG} & \cdots & p_{ACTT} \\ \vdots & & & & \vdots \\ p_{TTAA} & p_{TTAC} & p_{TTAG} & \cdots & p_{TTTT} \end{pmatrix}$
Proposition: Let $P \in M_T$.
- If $A|B$ is a valid split for $T$, then $\operatorname{rank}(\operatorname{Flat}_{A|B}(P)) \leq 4$. Invariants in $I_T$ are subdeterminants of $\operatorname{Flat}_{A|B}(P)$.
- If $C|D$ is not a valid split for $T$, then generically $\operatorname{rank}(\operatorname{Flat}_{C|D}(P)) > 4$.
23 Using Matrix Flattenings
Proposition: The set of valid splits for a given tree uniquely determines the tree.
Proposition: Let $P \in M_T$.
- If $A|B$ is a valid split for $T$, then $\operatorname{rank}(\operatorname{Flat}_{A|B}(P)) \leq 4$. Invariants in $I_T$ are subdeterminants of $\operatorname{Flat}_{A|B}(P)$.
- If $C|D$ is not a valid split for $T$, then generically $\operatorname{rank}(\operatorname{Flat}_{C|D}(P)) > 4$.
One can use flattening invariants to reconstruct phylogenetic trees via the SVD (Eriksson 2005, Chifman-Kubatko 2014, Allman-Kubatko-Rhodes 2016).
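The rank gap between valid and invalid splits can be observed numerically. A minimal sketch (not any particular published SVD method; helper names, stochastic conventions, and the seed are our choices, assuming NumPy) builds a 4-leaf model distribution and compares the two flattenings:

```python
import numpy as np

rng = np.random.default_rng(2)

def row_stochastic(shape):
    """Random nonnegative matrix whose rows sum to 1."""
    m = rng.random(shape)
    return m / m.sum(axis=1, keepdims=True)

k = 4
pi = rng.dirichlet(np.ones(k))
a, b, c, d, e, f = (row_stochastic((k, k)) for _ in range(6))

# 4-leaf tree where index j serves leaves X1, X2 and index k serves X3, X4:
# p_{lmno} = sum_{i,j,k} pi_i a_{ij} b_{ik} c_{jl} d_{jm} e_{kn} f_{ko}
p = np.einsum('i,ij,ik,jl,jm,kn,ko->lmno', pi, a, b, c, d, e, f)

flat_valid = p.reshape(16, 16)  # Flat_{12|34}: a valid split for this tree
flat_invalid = p.transpose(0, 2, 1, 3).reshape(16, 16)  # Flat_{13|24}: not valid

assert np.linalg.matrix_rank(flat_valid) <= 4
assert np.linalg.matrix_rank(flat_invalid) > 4  # holds for generic parameters
```

Comparing how close each flattening is to rank 4 (e.g. via its singular values) is the idea behind SVD-based split scoring.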
24 Identifiability of Phylogenetic Models
Definition: A parametric statistical model is identifiable if it gives a one-to-one map from parameters to probability distributions.
- Is it possible to infer the parameters of the model from data?
- Identifiability guarantees consistency of statistical methods (e.g. maximum likelihood).
Two types of parameters to consider for phylogenetic models:
- Numerical parameters (transition matrices)
- Tree parameter (combinatorial type of the tree)
25 Generic Identifiability
Definition: The tree parameter in a phylogenetic model is generically identifiable if for all $n$-leaf trees $T \neq T'$,
$\dim(M_T \cap M_{T'}) < \min(\dim M_T, \dim M_{T'})$.
[Figure: three models $M_1, M_2, M_3$ with lower-dimensional pairwise intersections]
26 Proving Identifiability with Algebraic Geometry
Proposition: Let $M_0$ and $M_1$ be two algebraic models. If there exist phylogenetically informative invariants $f_0$ and $f_1$ such that $f_i(p) = 0$ for all $p \in M_i$, and $f_i(q) \neq 0$ for some $q \in M_{1-i}$, then
$\dim(M_0 \cap M_1) < \min(\dim M_0, \dim M_1)$.
[Figure: the hypersurface $f = 0$ containing $M_0$ but missing a point $q \in M_1$]
27 Phylogenetic Models are Identifiable
Theorem: The unrooted tree parameter of phylogenetic models is generically identifiable.
Proof. Edge flattening invariants can detect which splits are implied by a specific distribution in $M_T$, and the splits of $T$ uniquely determine $T$.
These methods can be generalized to much more complicated settings:
- Phylogenetic mixture models (Allman-Petrovic-Rhodes-Sullivant 2009, Rhodes-Sullivant 2012, Long-Sullivant 2015)
- Species tree problems (Allman-Degnan-Rhodes 2011, Chifman-Kubatko 2014)
- K-mer methods (Allman-Rhodes-Sullivant 2016, Durden-Sullivant 201?)
28 Maximum Likelihood Estimation
Let $M \subseteq \mathbb{R}^r$ be a statistical model for discrete random variables. We collect i.i.d. data $X^{(1)}, \ldots, X^{(n)}$ that we believe to be generated from a distribution in $M$. Let $u = (u_1, \ldots, u_r)$ be the vector of counts: $u_i = \#\{j : X^{(j)} = i\}$.
Given the data $u$, the likelihood function is the monomial
$p^u = p_1^{u_1} p_2^{u_2} \cdots p_r^{u_r}$,
which is the probability of observing the data when $P(X = i) = p_i$.
The maximum likelihood estimate (MLE) of $p$ given $u$ is
$\operatorname{argmax}\ p^u \text{ such that } p \in M$.
29 Maximum Likelihood Degree
Computing the maximum likelihood estimate
$\operatorname{argmax}\ p^u \text{ such that } p \in M$
is an optimization problem for a polynomial constrained to a semialgebraic set. We can extend to the Zariski closure $V = \overline{M}$, in which case the optimum should be a critical point of the likelihood function (for generic $u$).
Theorem (Catanese-Hoşten-Khetan-Sturmfels 2006): The number of complex critical points of the likelihood function on $V_{sm} = V \setminus V_{sing}$ is constant for generic $u$. This number is called the maximum likelihood degree of the variety $V$ (or the model $M$).
30 Independence Model
Consider $M_{X \perp Y}$. Maximum likelihood estimation is
$\operatorname{argmax} \prod_{i,j} p_{ij}^{u_{ij}} \text{ such that } p \in M_{X \perp Y}$.
But $p \in M_{X \perp Y}$ iff there are $\alpha \in \Delta_r$, $\beta \in \Delta_s$ with $p_{ij} = \alpha_i \beta_j$. The problem becomes
$\operatorname{argmax} \prod_i \alpha_i^{u_{i+}} \prod_j \beta_j^{u_{+j}} \text{ such that } \alpha \in \Delta_r,\ \beta \in \Delta_s$.
The maximum likelihood estimate is
$\hat{p}_{ij} = \frac{u_{i+} u_{+j}}{u_{++}^2} = \frac{u_{i+}}{u_{++}} \cdot \frac{u_{+j}}{u_{++}}$
so $M_{X \perp Y}$ has ML-degree 1 (i.e. the MLE is a rational function of the data).
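The closed-form MLE is a one-liner on a contingency table. A minimal sketch (the count table is hypothetical, NumPy assumed):

```python
import numpy as np

# A hypothetical 2x3 contingency table of counts u_{ij}.
u = np.array([[10.0, 4.0, 6.0],
              [5.0, 8.0, 7.0]])

n = u.sum()           # u_{++}
row = u.sum(axis=1)   # row marginals u_{i+}
col = u.sum(axis=0)   # column marginals u_{+j}

# ML degree 1: the MLE is the rational function u_{i+} u_{+j} / u_{++}^2.
p_hat = np.outer(row, col) / n ** 2

assert np.isclose(p_hat.sum(), 1.0)
assert np.linalg.matrix_rank(p_hat) == 1  # the MLE lies in the independence model
```

No iterative optimization is needed, which is exactly what ML-degree 1 means.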
31 ML-degree and Algebraic Geometry
Theorem (Huh 2014): A variety $V$ has maximum likelihood degree 1 if and only if it is a reduced A-discriminant.
Theorem (Huh 2012): Suppose that $V \cap (\mathbb{C}^*)^r$ is a smooth variety. Then the ML-degree of $V$ is the reduced Euler characteristic of $V \cap (\mathbb{C}^*)^r$.
Questions:
- When is a reduced A-discriminant a statistical model?
- Does large ML-degree affect algorithms for computing MLEs?
32 Further Applications of Algebraic Statistics
- Fisher's exact test (Diaconis-Sturmfels 1998)
- (Non)existence of maximum likelihood estimates (Fienberg)
- Bayesian integrals and information criteria (Watanabe 2009)
- Conditional independence structures (Studeny 2004)
- Graphical models (Lauritzen 1996)
- Parametric inference (Pachter-Sturmfels 2004)
Learn more: Algebraic Statistics, the book. Draft version available at my website. Comments welcome!
More informationGeometry of Gaussoids
Geometry of Gaussoids Bernd Sturmfels MPI Leipzig and UC Berkeley p 3 p 13 p 23 a 12 3 p 123 a 23 a 13 2 a 23 1 a 13 p 2 p 12 a 12 p p 1 Figure 1: With The vertices Tobias andboege, 2-faces ofalessio the
More informationAlgebraic Statistics. Seth Sullivant. North Carolina State University address:
Algebraic Statistics Seth Sullivant North Carolina State University E-mail address: smsulli2@ncsu.edu 2010 Mathematics Subject Classification. Primary 62-01, 14-01, 13P10, 13P15, 14M12, 14M25, 14P10, 14T05,
More informationBayesian Analysis for Natural Language Processing Lecture 2
Bayesian Analysis for Natural Language Processing Lecture 2 Shay Cohen February 4, 2013 Administrativia The class has a mailing list: coms-e6998-11@cs.columbia.edu Need two volunteers for leading a discussion
More informationParametric Models: from data to models
Parametric Models: from data to models Pradeep Ravikumar Co-instructor: Manuela Veloso Machine Learning 10-701 Jan 22, 2018 Recall: Model-based ML DATA MODEL LEARNING MODEL MODEL INFERENCE KNOWLEDGE Learning:
More informationMaximum likelihood estimation
Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization
More informationGraphical Models for Collaborative Filtering
Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,
More informationDS-GA 1002 Lecture notes 11 Fall Bayesian statistics
DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian
More informationBMI/CS 776 Lecture 4. Colin Dewey
BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2
More informationLecture 21: Spectral Learning for Graphical Models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation
More informationSheaf cohomology and non-normal varieties
Sheaf cohomology and non-normal varieties Steven Sam Massachusetts Institute of Technology December 11, 2011 1/14 Kempf collapsing We re interested in the following situation (over a field K): V is a vector
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Lecture 5 Bayesian Learning of Bayesian Networks CS/CNS/EE 155 Andreas Krause Announcements Recitations: Every Tuesday 4-5:30 in 243 Annenberg Homework 1 out. Due in class
More informationELIZABETH S. ALLMAN and JOHN A. RHODES ABSTRACT 1. INTRODUCTION
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 5, 2006 Mary Ann Liebert, Inc. Pp. 1101 1113 The Identifiability of Tree Topology for Phylogenetic Models, Including Covarion and Mixture Models ELIZABETH
More informationOverview of Course. Nevin L. Zhang (HKUST) Bayesian Networks Fall / 58
Overview of Course So far, we have studied The concept of Bayesian network Independence and Separation in Bayesian networks Inference in Bayesian networks The rest of the course: Data analysis using Bayesian
More informationEVOLUTIONARY DISTANCES
EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:
More informationProbabilistic Machine Learning
Probabilistic Machine Learning Bayesian Nets, MCMC, and more Marek Petrik 4/18/2017 Based on: P. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Chapter 10. Conditional Independence Independent
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationLikelihood Geometry. June Huh and Bernd Sturmfels
Likelihood Geometry June Huh and Bernd Sturmfels arxiv:1305.7462v2 [math.ag] 18 Sep 2013 Abstract We study the critical points of monomial functions over an algebraic subset of the probability simplex.
More informationBinomial Exercises A = 1 1 and 1
Lecture I. Toric ideals. Exhibit a point configuration A whose affine semigroup NA does not consist of the intersection of the lattice ZA spanned by the columns of A with the real cone generated by A.
More informationSTAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01
STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationProbabilistic Systems Analysis Spring 2018 Lecture 6. Random Variables: Probability Mass Function and Expectation
EE 178 Probabilistic Systems Analysis Spring 2018 Lecture 6 Random Variables: Probability Mass Function and Expectation Probability Mass Function When we introduce the basic probability model in Note 1,
More informationMaximum Likelihood Estimation in Latent Class Models for Contingency Table Data
Maximum Likelihood Estimation in Latent Class Models for Contingency Table Data Stephen E. Fienberg Department of Statistics, Machine Learning Department, Cylab Carnegie Mellon University May 20, 2008
More informationOutline. Limits of Bayesian classification Bayesian concept learning Probabilistic models for unsupervised and semi-supervised category learning
Outline Limits of Bayesian classification Bayesian concept learning Probabilistic models for unsupervised and semi-supervised category learning Limitations Is categorization just discrimination among mutually
More informationALGEBRA: From Linear to Non-Linear. Bernd Sturmfels University of California at Berkeley
ALGEBRA: From Linear to Non-Linear Bernd Sturmfels University of California at Berkeley John von Neumann Lecture, SIAM Annual Meeting, Pittsburgh, July 13, 2010 Undergraduate Linear Algebra All undergraduate
More informationSequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them
HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More information