
MAXIMUM LIKELIHOOD ESTIMATION AND EM FIXED POINT IDEALS FOR BINARY TENSORS

Daniel Lemke

Version of May 27, 2016

Contents

1 Introduction
  1.1 Maximum Likelihood Estimation
  1.2 Results
2 Background Math
  2.1 Nonnegative Rank
  2.2 Tensors of Bounded Nonnegative Rank
  2.3 Maximum Likelihood Estimation: A Closer Look
  2.4 EM Algorithm for Matrices
  2.5 Ideals, Varieties, and Algorithms
3 EM Fixed Point Ideal
  3.1 Extension of EM Algorithm to Tensors
  3.2 EM Fixed Point Ideal for Tensors
4 MLE Using Boundary Strata
  4.1 Boundary Stratification of Binary Tensors
  4.2 Experiments
5 Implementation
  5.1 Cellular Decomposition, Primality, and Primary Decomposition
  5.2 EM, MLE, and Boundary Strata Experiments
6 Conclusion
Bibliography

I would like to express my deepest gratitude to Serkan Hoşten and Kaie Kubjas for providing the ideas and framework found in this thesis. Thanks go to Nathanael Aff for programming guidance and to Michelle Lemke Riggs and Matthias Beck for their special awareness of English grammar.

1 Introduction

The term Algebraic Statistics first appeared in the literature as the title of a 2001 book by Giovanni Pistone, Eva Riccomagno, and Henry Wynn [PRW]. Beginning with an introduction to Gröbner bases, it presents the application of polynomial algebra to statistics, discrete probability, and experimental design. In 2005 Lior Pachter and Bernd Sturmfels published a single-volume collection of works titled Algebraic Statistics for Computational Biology [PS]. It was written by an array of professionals and graduate students from the fields of algebra and computational biology. The book provides a thorough treatment of the basic principles of algebraic statistics and their relationship to computational biology, and presents an emerging dictionary between algebraic geometry and statistics. Our research is a continuation of the work of Sturmfels et al. The story picks up with a well-known problem in statistics called Maximum Likelihood Estimation.

1.1 Maximum Likelihood Estimation

The likelihood of a set of data is the probability of observing that particular set of data, given some statistical model, which is just a family of probability distributions. The values of the parameters that maximize the sample likelihood function are known as the Maximum Likelihood Estimates, or MLEs. MLEs have been studied since the dawn of the 20th century and were made popular by the statistician and biologist Sir Ronald Fisher [Wik].

Consider the following example from [PS], Section 1.1. Suppose we generate a DNA sequence by rolling three tetrahedral dice, each labelled A, C, G, and T, for the nucleobases adenine, cytosine, guanine, and thymine. Two of the dice are unfair and one is fair; suppose they have the associated probabilities of Table 1.1.

Table 1.1
                  A      C      G      T
    first die     0.15   0.33   0.36   0.16
    second die    0.27   0.24   0.23   0.26
    third die     0.25   0.25   0.25   0.25

We generate the DNA sequence

    CTCACGTGATGAGAGCATTCTCAGACCGTGACGCGTGTAGCAGCGGCTC

by selecting the first die with probability θ_1, the second with probability θ_2, and the third with probability 1 − θ_1 − θ_2. We would like to determine the parameters θ_1 and θ_2 that were used to select the dice. This amounts to a problem in optimization. Let p_A, p_C, p_G, and p_T denote the probabilities of generating each of the four letters. The statistical model derived from Table 1.1 is written algebraically as follows:

    p_A = −0.10 θ_1 + 0.02 θ_2 + 0.25,
    p_C =  0.08 θ_1 − 0.01 θ_2 + 0.25,
    p_G =  0.11 θ_1 − 0.02 θ_2 + 0.25,
    p_T = −0.09 θ_1 + 0.01 θ_2 + 0.25.

We emphasize that these are polynomials in the unknowns θ_1 and θ_2. In statistical terminology these unknowns are called the model parameters. Each of the 49 characters was generated independently, so the likelihood of observing the above DNA sequence is the product of the probabilities of observing the individual letters:

    L(θ_1, θ_2) = p_C(θ_1, θ_2) · p_T(θ_1, θ_2) · p_C(θ_1, θ_2) · p_A(θ_1, θ_2) · p_C(θ_1, θ_2) ⋯
                = p_A(θ_1, θ_2)^10 · p_C(θ_1, θ_2)^14 · p_G(θ_1, θ_2)^15 · p_T(θ_1, θ_2)^10.

In the maximum likelihood framework we estimate the unknown parameters by those values in the parameter space which make the likelihood of observing the data as large as possible. The parameter space over which we maximize L(θ_1, θ_2) is the triangle

    Θ = { (θ_1, θ_2) ∈ R^2 : θ_1 > 0, θ_2 > 0, and θ_1 + θ_2 < 1 }.

It is simpler and equivalent to maximize the log of the likelihood function, denoted ℓ(θ):

    ℓ(θ) = log(L(θ_1, θ_2)) = 10 log(p_A(θ_1, θ_2)) + 14 log(p_C(θ_1, θ_2)) + 15 log(p_G(θ_1, θ_2)) + 10 log(p_T(θ_1, θ_2)),

and we can obtain the solution to this optimization problem using techniques from calculus. Optimization yields the maximum likelihood estimate (θ̂_1, θ̂_2).

One of the drawbacks of maximum likelihood estimation, in terms of popular use, is that it is in general a nonconvex optimization problem requiring solutions to complicated nonlinear systems of equations. It is common in practice to circumvent these issues by using the hill-climbing Expectation Maximization (EM) algorithm, one of the main topics of this thesis. However, any algorithm of this type is doomed to imperfection. It will inevitably run into the problem of being trapped in local maxima and will have no way of providing a certificate for having found the global optimum, which may or may not exist [KRS +, pg. 2].
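To make the dice example concrete, the following is a minimal Julia sketch (our own illustration, not a computation from the thesis) that approximates the maximizer of ℓ(θ) by a grid search over the triangle Θ, using the model polynomials above; the function names are ours.

    # Log-likelihood of the observed sequence (10 A's, 14 C's, 15 G's, 10 T's).
    pA(t1, t2) = -0.10t1 + 0.02t2 + 0.25
    pC(t1, t2) =  0.08t1 - 0.01t2 + 0.25
    pG(t1, t2) =  0.11t1 - 0.02t2 + 0.25
    pT(t1, t2) = -0.09t1 + 0.01t2 + 0.25
    loglik(t1, t2) = 10log(pA(t1, t2)) + 14log(pC(t1, t2)) +
                     15log(pG(t1, t2)) + 10log(pT(t1, t2))

    # Approximate the MLE by brute force over a fine grid of the open triangle.
    function grid_mle(step = 0.001)
        best, argbest = -Inf, (0.0, 0.0)
        for t1 in step:step:1-step, t2 in step:step:1-step
            t1 + t2 < 1 || continue
            v = loglik(t1, t2)
            if v > best
                best, argbest = v, (t1, t2)
            end
        end
        return argbest, best
    end

    @show grid_mle()

A gradient-based method converges much faster, of course; the grid search is only meant to make the optimization problem tangible.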

1.2 Results

We analyze the behavior of the EM algorithm in the case where the model M, the space over which we are optimizing, consists of 2 × 2 × 2 data arrays of nonnegative rank 2 (cf. Section 2.1). M is a nonconvex, compact, semi-algebraic subset of a 7-dimensional tetrahedron.

Figure 1.1: Representative picture of the 7-dimensional model M as a 3-dimensional nonconvex, nonlinear, compact subset of the 3-dimensional tetrahedron.

Since maximum likelihood estimation is an optimization problem, in order to locate the global optimum one restricts the objective function to the interior and to each boundary stratum, finds the maximum on each of these strata, and picks the best value among them. Allman, Hoşten, Rhodes, and Zwiernik [AHRZ] give exact formulas for the maxima on each boundary stratum of M. [AHRZ] follows [ARSZ], in which M is realized as those probability distributions satisfying a special set of polynomial equalities and inequalities. We analyze the [AHRZ] formulas by determining how often they produce MLEs within M. We determine the strata of M that the EM algorithm is most attracted to, find the frequency with which the EM algorithm locates the global optimum, and count the number of times the EM algorithm must be run in order to find the MLE. We also compare the computation times of running the algorithm against using [AHRZ], and produce a picture of the behavior of the algorithm on a 3-dimensional slice of the 7-dimensional model M.

We also analyze an algebraic approach to the EM algorithm and MLE. The EM fixed points are all the points that the EM algorithm can potentially converge to. These points represent the entire collection of maxima in the relative interior of M, as well as maximizers on the boundaries of M, and can be realized as the vanishing set of a collection of polynomials. We find these polynomials, following in the footsteps of [KRS +], and describe the set of all EM fixed points for maximum likelihood problems on two separate classes of data arrays. In total we discuss and compare three approaches to the maximum likelihood problem on M: one is algorithmic, one is formulaic, and one is algebraic.

In Chapter 2 we cover the background math necessary to understand Chapters 3 and 4. These concepts include MLE, nonnegative rank, tensors of bounded nonnegative rank, and the EM algorithm for matrices. We also discuss ideals, varieties, and primary decomposition. In Chapter 3 we describe the EM fixed point ideal for binary tensors of nonnegative rank at most 2 and at most 3, we describe cellular decomposition, which was used to produce these ideals, and we provide tables completely characterizing these ideals. In Chapter 4 we provide results on MLE using the boundary strata given in [ARSZ] and [AHRZ].

2 Background Math

2.1 Nonnegative Rank

The nonnegative rank of a nonnegative matrix A ∈ R^{m×n}_{≥0}, denoted rank_+(A), is the smallest r ∈ Z_{≥0} such that A = B · C for nonnegative B ∈ R^{m×r} and nonnegative C ∈ R^{r×n}. Equivalently, it is the smallest r such that A can be written as the sum of r nonnegative rank 1 matrices, A = Σ_{i=1}^r x_i · y_i with x_i ∈ R^{m×1}_{≥0} and y_i ∈ R^{1×n}_{≥0}. Rank is always less than or equal to nonnegative rank. The smallest case for which rank and nonnegative rank disagree is m = n = 4, and [CR] provides the standard example: a 4 × 4 nonnegative matrix with rank_+ = 4 whose rows satisfy a linear dependence, so that the matrix has rank 3 in the usual sense. Stephen Vavasis shows that nonnegative matrix factorization is NP-hard in [Vav].
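The gap between the two notions of rank can be checked concretely in Julia. The matrix below is the example most commonly cited for this phenomenon; we cannot confirm that it is exactly the matrix displayed in [CR], so treat it as an illustration.

    using LinearAlgebra

    # A 4x4 nonnegative matrix with ordinary rank 3 but nonnegative rank 4.
    A = [1.0 1.0 0.0 0.0;
         1.0 0.0 1.0 0.0;
         0.0 1.0 0.0 1.0;
         0.0 0.0 1.0 1.0]

    @show rank(A)   # 3, since row1 - row2 - row3 + row4 = 0

    # Nonnegative rank 4: a nonnegative rank 1 summand has rectangular support,
    # and the support of A is an 8-cycle, so each summand covers at most two of
    # the eight positive entries; hence at least four summands are needed.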

2.2 Tensors of Bounded Nonnegative Rank

A real nonnegative tensor is a multidimensional array in R^{d_1×d_2×⋯×d_n}_{≥0}. A vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and a 3-or-higher dimensional tensor is just a tensor.

Figure 2.1: A 3 × 3 × 3 and a 2 × 2 × 2 × 2 tensor. Image sources: [Kar] & [Wal]. The cells of a Rubik's Cube represent a 3 × 3 × 3 tensor, and a labelling of the vertices of a 4-dimensional cube represents a 2 × 2 × 2 × 2 tensor.

Example 2.1. Let a = (a_1, a_2), b = (b_1, b_2), c = (c_1, c_2) ∈ R^2_{≥0}. Then a ⊗ b ⊗ c is a nonnegative rank 1, 2 × 2 × 2 tensor and can be written in slices as

    ( a_1 b_1 c_1   a_1 b_1 c_2 )      ( a_2 b_1 c_1   a_2 b_1 c_2 )
    ( a_1 b_2 c_1   a_1 b_2 c_2 ),     ( a_2 b_2 c_1   a_2 b_2 c_2 ).

This is just one view of the tensor (front-to-back), but note that it is rank 1 in the usual sense: each slice is a scalar multiple of the other, independent of the viewpoint.

Figure 2.2: This tensor can be written in different slices as viewed from the top-down, bottom-up, left-right, and right-left.

A tensor P of format d_1 × d_2 × ⋯ × d_n has nonnegative rank r if r is the smallest natural number such that P can be written as the sum of r nonnegative rank 1 tensors. Thus

we can build tensors of arbitrary nonnegative rank by adding nonnegative rank 1 tensors. A rank_+ r tensor of this form can be written

    P = a_{11} ⊗ a_{12} ⊗ ⋯ ⊗ a_{1n} + a_{21} ⊗ a_{22} ⊗ ⋯ ⊗ a_{2n} + ⋯ + a_{r1} ⊗ a_{r2} ⊗ ⋯ ⊗ a_{rn}    (2.1)

with nonnegative vectors a_{ij} ∈ R^{d_j}_{≥0}.

Example 2.2. Let P = [p_ijk] be a real 2 × 2 × 2 tensor. Then P has nonnegative rank at most 2 if there exist nonnegative 2 × 2 matrices

    A = ( a_11  a_12 ),   B = ( b_11  b_12 ),   C = ( c_11  c_12 )
        ( a_21  a_22 )        ( b_21  b_22 )        ( c_21  c_22 )

such that p_ijk = a_1i b_1j c_1k + a_2i b_2j c_2k.

Figure 2.3: Rank_+ 2 tensor decomposition, adapted from [KB], depicting a general rank_+ 2 tensor being constructed by adding rank_+ 1 tensors, which are themselves built from the rows of the nonnegative matrices A, B, and C.

It is shown in [Lan, 5.5] that the set of real tensors P = [p_{i_1 i_2 ⋯ i_n}] of format d_1 × d_2 × ⋯ × d_n of nonnegative rank at most 2 is a closed semialgebraic subset of dimension 2(d_1 + d_2 + ⋯ + d_n) − 2(n − 1). Throughout, we informally refer to the set of tensors of some dimensions and rank as a space of tensors of those dimensions and rank.

Definition 2.1 ([ARSZ]). Suppose P is of the form (2.1) with n ≥ 3, d_i ≥ 2, and r = 2. Pick any subset A of [n] = {1, 2, ..., n} with 1 ≤ |A| ≤ n − 1 and write the tensor P as an ordinary matrix with Π_{i∈A} d_i rows and Π_{j∉A} d_j columns. The flattening rank of P is the maximal rank of any of these matrices.

Definition 2.2 ([ARSZ]). Fix a tuple π = (π_1, π_2, ..., π_n) where π_i is a permutation of {1, ..., d_i}. Then P is π-supermodular if

    p_{i_1 i_2 ⋯ i_n} · p_{j_1 j_2 ⋯ j_n} ≤ p_{k_1 k_2 ⋯ k_n} · p_{l_1 l_2 ⋯ l_n}    (2.2)

whenever {i_r, j_r} = {k_r, l_r} and π_r(k_r) ≤ π_r(l_r) holds for r = 1, 2, ..., n. A tensor P is called supermodular if it is π-supermodular for some π.

Theorem 2.1 ([ARSZ]). A nonnegative tensor P has nonnegative rank at most 2 if and only if P is supermodular and has flattening rank at most 2.

Example 2.3. Let P = [p_ijkl] be a real 2 × 2 × 2 × 2 tensor. The tensors P with flattening rank at most 2 are exactly the solutions to the system of equations defined by the 3 × 3 minors of the matrices

    ( p_1111  p_1112  p_1121  p_1122 )      ( p_1111  p_1112  p_1211  p_1212 )      ( p_1111  p_1121  p_1211  p_1221 )
    ( p_1211  p_1212  p_1221  p_1222 )      ( p_1121  p_1122  p_1221  p_1222 )      ( p_1112  p_1122  p_1212  p_1222 )
    ( p_2111  p_2112  p_2121  p_2122 ),     ( p_2111  p_2112  p_2211  p_2212 ),     ( p_2111  p_2121  p_2211  p_2221 )    (2.3)
    ( p_2211  p_2212  p_2221  p_2222 )      ( p_2121  p_2122  p_2221  p_2222 )      ( p_2112  p_2122  p_2212  p_2222 )

obtained by setting n = 4 and A = {1, 2}, A = {1, 3}, and A = {1, 4}, respectively, in Definition 2.1. Since A and A^c yield transpose matrices, and since A = {1} results in a 2 × 8 matrix, there are no other 3 × 3 minors to consider.

Example 2.4 ([ARSZ]). Let P = [p_ijk] be a real 2 × 2 × 2 tensor. As in Example 2.2, p_ijk = a_1i b_1j c_1k + a_2i b_2j c_2k. In this case there are no flattening rank conditions, since no flattening has any 3 × 3 minors. For π = (id, id, id), the binomial inequalities for supermodularity are

    p_111 p_222 ≥ p_112 p_221    p_111 p_222 ≥ p_121 p_212    p_111 p_222 ≥ p_211 p_122
    p_112 p_222 ≥ p_122 p_212    p_121 p_222 ≥ p_122 p_221    p_211 p_222 ≥ p_212 p_221    (2.4)
    p_111 p_122 ≥ p_112 p_121    p_111 p_212 ≥ p_112 p_211    p_111 p_221 ≥ p_121 p_211

Nonnegative tensors P that satisfy these nine inequalities lie in the set M_{id,id,id} = M_{(12),(12),(12)}. By the label swapping 1 ↔ 2, we obtain three other sets M_{id,id,(12)} = M_{(12),(12),id}, M_{id,(12),id} = M_{(12),id,(12)}, and M_{(12),id,id} = M_{id,(12),(12)}. Thus, by definition, the semialgebraic set of all supermodular tensors is the union

    M = M_{id,id,id} ∪ M_{id,id,(12)} ∪ M_{id,(12),id} ∪ M_{(12),id,id}.    (2.5)

Theorem 2.1 states that P ∈ R^{2×2×2}_{≥0} has nonnegative rank at most 2 if and only if P lies in M.
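A hedged Julia sketch of this criterion (our own illustration, with function names of our choosing): we build a random tensor of nonnegative rank at most 2 as in Example 2.2 and verify that it satisfies the inequalities (2.4) for one of the four label swaps in (2.5), as Theorem 2.1 predicts.

    # Build a 2x2x2 tensor of nonnegative rank <= 2 from nonnegative 2x2 matrices.
    make_rank2(A, B, C) = [A[1,i]*B[1,j]*C[1,k] + A[2,i]*B[2,j]*C[2,k]
                           for i in 1:2, j in 1:2, k in 1:2]

    # The nine inequalities (2.4) for pi = (id, id, id), up to a numerical tolerance.
    function is_id_supermodular(p; tol = 1e-12)
        pairs = [((1,1,1),(2,2,2),(1,1,2),(2,2,1)), ((1,1,1),(2,2,2),(1,2,1),(2,1,2)),
                 ((1,1,1),(2,2,2),(2,1,1),(1,2,2)), ((1,1,2),(2,2,2),(1,2,2),(2,1,2)),
                 ((1,2,1),(2,2,2),(1,2,2),(2,2,1)), ((2,1,1),(2,2,2),(2,1,2),(2,2,1)),
                 ((1,1,1),(1,2,2),(1,1,2),(1,2,1)), ((1,1,1),(2,1,2),(1,1,2),(2,1,1)),
                 ((1,1,1),(2,2,1),(1,2,1),(2,1,1))]
        all(p[a...]*p[b...] >= p[c...]*p[d...] - tol for (a,b,c,d) in pairs)
    end

    # Swapping the label 1 <-> 2 in a coordinate amounts to reversing that index.
    relabel(p, s) = [p[s[1] ? 3-i : i, s[2] ? 3-j : j, s[3] ? 3-k : k]
                     for i in 1:2, j in 1:2, k in 1:2]
    is_supermodular(p) = any(is_id_supermodular(relabel(p, s)) for s in
                             [(false,false,false), (false,false,true),
                              (false,true,false), (true,false,false)])

    P = make_rank2(rand(2,2), rand(2,2), rand(2,2))
    @show is_supermodular(P)   # true, as guaranteed by Theorem 2.1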

2.3 Maximum Likelihood Estimation: A Closer Look

When dealing with statistical models involving discrete data we may identify the sample space with the set of the first m positive integers, [m] := {1, 2, ..., m}. A probability distribution on the set [m] is a point in the probability simplex

    Δ_{m−1} := { (p_1, ..., p_m) ∈ R^m : Σ_{i=1}^m p_i = 1 and p_j ≥ 0 for all j }.

The algebraic statistical model is a natural generalization of the ordinary statistical model. It comes as the image of a polynomial map

    f : R^d → R^m,   θ = (θ_1, θ_2, ..., θ_d) ↦ (f_1(θ), f_2(θ), ..., f_m(θ)).    (2.6)

Each f_i is a polynomial in R[θ_1, ..., θ_d], and θ_1, ..., θ_d are the model parameters. Furthermore, (θ_1, θ_2, ..., θ_d) is a point in Θ, a non-empty open subset of R^d called the parameter space of the model f. We assume that Θ satisfies f_i(θ) > 0 for all i ∈ [m] and θ ∈ Θ.

Since the data is discrete, it can be given in the form of a sequence of observations

    i_1, i_2, ..., i_N    (2.7)

where each i_j is an element of the sample space [m]. The integer N is the sample size. This data can be summarized in the data vector u = (u_1, u_2, ..., u_m), where u_k is the number of indices j ∈ [N] such that i_j = k. Hence u ∈ N^m, where N = {0, 1, 2, ...}, and u_1 + u_2 + ⋯ + u_m = N. The empirical distribution corresponding to the data (2.7) is the scaled vector (1/N)u, which is a point in the probability simplex. We consider the model f to be a good fit for the data u if there exists a parameter vector θ ∈ Θ such that the probability distribution f(θ) is close, in a statistically meaningful way, to the empirical distribution (1/N)u. Were we to draw N times at random from the set [m] with respect to the probability distribution f(θ), then the probability of observing the sequence (2.7) gives the likelihood function

    L(θ) = f_{i_1}(θ) f_{i_2}(θ) ⋯ f_{i_N}(θ) = f_1(θ)^{u_1} f_2(θ)^{u_2} ⋯ f_m(θ)^{u_m}.    (2.8)

Since u represents the observed data it is fixed, and L depends only on θ; therefore, L is a function from Θ to R_{>0}. It is equivalent but simpler to deal with the log of the likelihood function, ℓ(θ). The problem of maximum likelihood estimation is to maximize ℓ(θ) where θ ranges over the parameter space Θ. Put plainly, we aim to solve the optimization problem:

    maximize ℓ(θ) subject to θ ∈ Θ.    (2.9)

A solution to (2.9) is called a maximum likelihood estimate of θ with respect to the model f and the data u, and is denoted θ̂. For many statistical models a maximum likelihood estimate may not exist, and if it does, there could be more than one global maximum; actually, there can be infinitely many of them [PS]. Also, it may be difficult to find any one of these global maxima. This is where the Expectation Maximization (EM) algorithm enters the picture. It is a numerical method for finding solutions to (2.9), but it also gives insight, like shading paper over a leaf, into the topology of the model M. For a detailed treatment of maximum likelihood estimation in the context of computational biology, see [PS], Sections 1.1, 1.3, and 3.3, from which the above exposition is derived.

Let us consider maximum likelihood estimation in a less general setting. The rth mixture model M of two discrete random variables X and Y expresses the conditional independence statement X ⊥⊥ Y | Z, where Z is a hidden variable with r states. (Imagine having data on hair length and height. The hidden variable is gender and has r states, depending on how one chooses to classify gender.) Now,

assuming X and Y have m and n states respectively, their joint distribution is written as an m × n matrix of nonnegative rank r whose entries sum to 1. Let the nonnegative matrix

    U = ( u_11  ⋯  u_1n )
        (  ⋮    ⋱   ⋮   )
        ( u_m1  ⋯  u_mn )

be a collection of independent and identically distributed samples from a joint distribution. Here, u_ij is the number of observations in the sample with X = i and Y = j. The sample size is u_{++} = Σ_{i,j} u_ij. The EM algorithm attempts to maximize the log-likelihood function (2.12) of the model M. It approximates the data matrix U with a product of nonnegative matrices A and B, where A ∈ R^{m×r}_{≥0} and B ∈ R^{r×n}_{≥0}. As mentioned in the introduction, this is a nonconvex optimization problem, and any algorithm that attempts to solve it will run into a host of problems, of which the following dichotomy is most fundamental: either the MLE P̂ lies in the relative interior of the model M, or it lies in the boundary ∂M of the model. If P̂ lies in ∂M, then it is generally not a critical point for the likelihood function in the space of rank r matrices. It is shown in [KRS +] that for 8 × 8 matrices of nonnegative rank 5, 96% of data matrices have MLEs lying in the boundary ∂M.

Let Δ_{mn−1} denote the probability simplex of nonnegative m × n matrices P = [p_ij]. The model M is the subset of Δ_{mn−1} consisting of all matrices of the form

    P = A Λ B,    (2.10)

where A is a nonnegative m × r matrix whose columns sum to 1, Λ is a nonnegative r × r diagonal matrix whose entries sum to 1, and B is a nonnegative r × n matrix whose rows sum to 1. The kth column of A represents the conditional probability distribution of X given Z = k; the kth row of B represents the conditional probability distribution of Y given Z = k; and the diagonal of Λ is the probability distribution of Z. The parameter space in which (A, Λ, B) lies is the convex polytope Θ = (Δ_{m−1})^r × Δ_{r−1} × (Δ_{n−1})^r. The model M is the image of the trilinear map Θ → Δ_{mn−1}, (A, Λ, B) ↦ P. We aim to learn the model parameters (A, Λ, B) by maximizing the likelihood function

    ( u_{++} choose u ) · Π_{i=1}^m Π_{j=1}^n p_ij^{u_ij}    (2.11)

or, equivalently, by maximizing the log-likelihood function

    ℓ_U = Σ_{i=1}^m Σ_{j=1}^n u_ij log( Σ_{k=1}^r a_ik λ_k b_kj )    (2.12)

over M.

2.4 EM Algorithm for Matrices

The EM algorithm for m × n matrices is an iterative method for finding local maxima of the log-likelihood function (2.12). Algorithm 1 presents the version in [PS], Section 1.3.

Algorithm 1 Function EM(U, r)
    Select random a_1, a_2, ..., a_r ∈ Δ_{m−1}, random λ ∈ Δ_{r−1}, and random b_1, b_2, ..., b_r ∈ Δ_{n−1}.
    Run the following steps until the entries of the m × n matrix P converge.
    E-Step: Estimate the m × r × n table that represents the expected hidden data:
        Set v_ikj := (a_ik λ_k b_kj / Σ_{l=1}^r a_il λ_l b_lj) · u_ij for i = 1, ..., m, k = 1, ..., r, and j = 1, ..., n.
    M-Step: Maximize the likelihood function of the model for the hidden data:
        Set λ_k := Σ_{i=1}^m Σ_{j=1}^n v_ikj / u_{++} for k = 1, ..., r.
        Set a_ik := Σ_{j=1}^n v_ikj / (u_{++} λ_k) for k = 1, ..., r and i = 1, ..., m.
        Set b_kj := Σ_{i=1}^m v_ikj / (u_{++} λ_k) for k = 1, ..., r and j = 1, ..., n.
    Update the estimate of the joint distribution for our mixture model:
        Set p_ij := Σ_{k=1}^r a_ik λ_k b_kj for i = 1, ..., m, j = 1, ..., n.
    Return P.

The alternating sequence of estimation steps and maximization steps (E- and M-steps) defines trajectories in the parameter polytope Θ. The log-likelihood function (2.12) is nondecreasing along each trajectory (cf. [PS], Theorem 1.15). The value can remain unchanged only at a fixed point of the EM algorithm.

Definition 2.3. An EM fixed point for a given table U is any point (A, Λ, B) in the polytope Θ = (Δ_{m−1})^r × Δ_{r−1} × (Δ_{n−1})^r to which the EM algorithm can converge if it is applied to (U, r).

Lemma 2.2 ([KRS +]). The following are equivalent for a point (A, Λ, B) in the parameter polytope Θ:
1. The point (A, Λ, B) is an EM fixed point.
2. If we start EM with (A, Λ, B) instead of a random point, then EM converges to (A, Λ, B).
3. The point (A, Λ, B) remains fixed after one E-step and one M-step.

Every global maximum P̂ of ℓ_U is among the EM fixed points. [KRS +] identify the polynomials whose roots represent all fixed points in the 4 × 4 matrix case. Since a point is EM fixed if and only if it stays fixed after an E-step and an M-step, we can write rational function equations for the EM fixed points in Θ. We examine this process in depth in Chapter 3.
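The following is a short Julia transcription of Algorithm 1 (our own sketch, with a fixed iteration count in place of a convergence test; it is not the thesis implementation, and the function name is ours).

    # EM for the r-th mixture model of an m x n count matrix U (Algorithm 1).
    function em_matrix(U, r; iters = 500)
        m, n = size(U)
        upp = sum(U)
        A = rand(m, r); A ./= sum(A, dims = 1)     # columns of A sum to 1
        B = rand(r, n); B ./= sum(B, dims = 2)     # rows of B sum to 1
        lam = rand(r);  lam ./= sum(lam)
        P = zeros(m, n)
        for _ in 1:iters
            # E-step: expected hidden data v[i,k,j]
            v = [A[i,k]*lam[k]*B[k,j] / sum(A[i,l]*lam[l]*B[l,j] for l in 1:r) * U[i,j]
                 for i in 1:m, k in 1:r, j in 1:n]
            # M-step: update lambda, A, B, then the joint distribution P
            lam = [sum(v[:,k,:]) / upp for k in 1:r]
            A   = [sum(v[i,k,:]) / (upp*lam[k]) for i in 1:m, k in 1:r]
            B   = [sum(v[:,k,j]) / (upp*lam[k]) for k in 1:r, j in 1:n]
            P   = [sum(A[i,k]*lam[k]*B[k,j] for k in 1:r) for i in 1:m, j in 1:n]
        end
        return P, A, lam, B
    end

    U = [4 2 2; 2 4 2; 2 2 4]          # a small made-up count matrix
    P, A, lam, B = em_matrix(U, 2)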

2.5 Ideals, Varieties, and Algorithms

Let R = K[x_1, ..., x_n] be the ring of polynomials in n variables with coefficients in a subfield K of the real numbers R, usually the rational numbers K = Q.

Definition 2.4. A subset I ⊆ R is an ideal in R if I is a subgroup of R under addition and, for every f ∈ I and every g ∈ R, we have fg ∈ I. Equivalently, an ideal I is closed under taking linear combinations with coefficients in the ring R.

Definition 2.5. Let K be a field and let f_1, ..., f_s be polynomials in K[x_1, ..., x_n]. Then we set

    V(f_1, ..., f_s) = { (a_1, ..., a_n) ∈ K^n : f_i(a_1, ..., a_n) = 0 for all 1 ≤ i ≤ s }.

We call V(f_1, ..., f_s) the variety defined by f_1, ..., f_s.

Let T = {f_1, ..., f_s}. The ideal generated by T, denoted ⟨T⟩, is the smallest ideal in R containing T. We use V(T) in place of V(⟨T⟩). In computational algebra we often replace T by a Gröbner basis of ⟨T⟩. This allows us to test ideal membership and to determine geometric properties of the variety V(T) [CLO].

Definition 2.6. A subset X ⊆ C^n is a variety if X = V(T) for some T ⊆ R. A variety X ⊆ C^n is irreducible if we cannot write X = X_1 ∪ X_2, where X_1, X_2 ⊊ X are strictly smaller varieties. An ideal I ⊆ R is prime if fg ∈ I implies f ∈ I or g ∈ I.

Proposition 2.3. The variety X is irreducible if and only if I(X) is prime.

An ideal is radical if it is an intersection of prime ideals.

Proposition 2.4. Every variety X can be written uniquely as X = X_1 ∪ X_2 ∪ ⋯ ∪ X_m, where X_1, X_2, ..., X_m are irreducible and none of these m components contains any other. Moreover, I(X) = I(X_1) ∩ I(X_2) ∩ ⋯ ∩ I(X_m) is the unique decomposition of the radical ideal I(X) as an intersection of prime ideals.

A minimal prime of an ideal I is a prime ideal J such that V(J) is an irreducible component of V(I).

Definition 2.7. An ideal I in K[x_1, ..., x_n] is primary if fg ∈ I implies either f ∈ I or g^m ∈ I for some m > 0.

Lemma 2.5. If an ideal I is primary, then √I is prime, and it is the smallest prime ideal containing I.

All ideals I in R can be written as intersections of primary ideals; that is, there is a decomposition I = Q_1 ∩ Q_2 ∩ ⋯ ∩ Q_s where each Q_i is primary. The radical P = √Q of a primary ideal Q is a prime ideal, and Q is called P-primary. Primary ideals are more general than prime ideals, but they still define irreducible varieties, and geometrically primary ideals contain the same information as do their prime counterparts.

Definition 2.8. Let I ⊆ K[x_1, ..., x_n] be an ideal and f ∈ K[x_1, ..., x_n]. Then the saturation of I with respect to f is the ideal

    (I : f^∞) = { g ∈ K[x_1, ..., x_n] : g f^m ∈ I for some m > 0 }.

Saturating an ideal I by a polynomial f geometrically means that we obtain a new ideal J = (I : f^∞) whose variety V(J) contains all components of V(I) except for the ones on which f vanishes. For more on these concepts see [CLO], from which this section is derived.
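A small worked example, added here for illustration: take I = ⟨xy, xz⟩ in K[x, y, z]. Its variety V(I) is the union of the plane {x = 0} and the line {y = z = 0}. Saturating by x removes the component on which x vanishes: (I : x^∞) = ⟨y, z⟩, whose variety is exactly the line {y = z = 0}. Saturation plays exactly this role in Chapters 3 and 5, where components of the EM fixed point variety lying on coordinate hyperplanes are discarded.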

3 EM Fixed Point Ideal

3.1 Extension of EM Algorithm to Tensors

Maximum likelihood estimation and the EM algorithm for matrices extend naturally to data given in the form of a tensor, which is just a table of dimension higher than 2. Here we restate the MLE problem and the EM algorithm for 2 × 2 × 2 tensors of nonnegative rank 2 and describe the ideal of EM fixed points. We begin by updating the parameter polytope Θ to (Δ_1 × Δ_1 × Δ_1)^2 × Δ_1. A point in Θ is of the form (A, B, C, Λ) where A, B, C ∈ R^{2×2}_{≥0} are nonnegative and row stochastic, and Λ ∈ R^{2×2}_{≥0} is a nonnegative diagonal 2 × 2 matrix. The model M is the image of the quadrilinear map

    Θ → Δ_7,   (A, B, C, Λ) ↦ P.    (3.1)

We update the function ℓ_U to reflect the tensor U. Now we seek to maximize

    ( u_{+++} choose u ) · Π_{i,j,k=1}^2 p_ijk^{u_ijk},

where the u_ijk are the data and the unknowns P = [p_ijk] form a nonnegative tensor of nonnegative rank 2 with p_{+++} = 1. Since we do not allow p_ijk = 0, the set of such tensors P is a strict subset of the probability simplex Δ_7. Again, this is equivalent to maximizing the log-likelihood function

    ℓ_U = Σ_{i,j,k} u_ijk log(p_ijk) = Σ_{i,j,k} u_ijk log( Σ_{l=1}^r λ_l a_li b_lj c_lk ).    (3.2)

In Algorithm 2 we update the EM algorithm for matrices to reflect the new format of the data.

Algorithm 2 Function EM(U, r), with i, j, k ∈ {1, 2}, r = 2, and U = [u_ijk] ∈ R^{2×2×2}_{≥0}
    Select random nonnegative row-stochastic matrices A, B, C in R^{2×2}_{≥0} and a nonnegative diagonal 2 × 2 matrix Λ. Define the two nonnegative rank 1, 2 × 2 × 2 tensors [λ_1 a_1i b_1j c_1k] and [λ_2 a_2i b_2j c_2k].
    Run the following steps until the entries of the tensor P converge.
    E-Step: Estimate the table that represents the expected hidden data:
        Set v^l_ijk := (λ_l a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) · u_ijk for i, j, k, l = 1, 2.
    M-Step: Maximize the likelihood function of the model for the hidden data:
        Set λ_l := Σ_{i,j,k=1}^2 v^l_ijk / u_{+++} for l = 1, 2.
        Set a_li := Σ_{j,k=1}^2 v^l_ijk / (u_{+++} λ_l) for l, i = 1, 2.
        Set b_lj := Σ_{i,k=1}^2 v^l_ijk / (u_{+++} λ_l) for l, j = 1, 2.
        Set c_lk := Σ_{i,j=1}^2 v^l_ijk / (u_{+++} λ_l) for l, k = 1, 2.
    Update the estimate of the joint distribution for our mixture model:
        Set p_ijk := Σ_{l=1}^2 λ_l a_li b_lj c_lk for i, j, k = 1, 2.
    Return P.
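A Julia sketch of Algorithm 2 follows (our own transcription with a fixed iteration count; the thesis implementation is also in Julia but is not reproduced here, and the function name is ours).

    # EM for a 2x2x2 count tensor U and nonnegative rank r = 2 (Algorithm 2).
    function em_tensor(U; iters = 1000)
        uppp = sum(U)
        A = rand(2, 2); A ./= sum(A, dims = 2)     # row stochastic
        B = rand(2, 2); B ./= sum(B, dims = 2)
        C = rand(2, 2); C ./= sum(C, dims = 2)
        lam = rand(2);  lam ./= sum(lam)
        P = zeros(2, 2, 2)
        for _ in 1:iters
            # E-step: expected hidden data v[l,i,j,k]
            v = [lam[l]*A[l,i]*B[l,j]*C[l,k] /
                 sum(lam[s]*A[s,i]*B[s,j]*C[s,k] for s in 1:2) * U[i,j,k]
                 for l in 1:2, i in 1:2, j in 1:2, k in 1:2]
            # M-step
            lam = [sum(v[l,:,:,:]) / uppp for l in 1:2]
            A = [sum(v[l,i,:,:]) / (uppp*lam[l]) for l in 1:2, i in 1:2]
            B = [sum(v[l,:,j,:]) / (uppp*lam[l]) for l in 1:2, j in 1:2]
            C = [sum(v[l,:,:,k]) / (uppp*lam[l]) for l in 1:2, k in 1:2]
            P = [sum(lam[l]*A[l,i]*B[l,j]*C[l,k] for l in 1:2)
                 for i in 1:2, j in 1:2, k in 1:2]
        end
        return P, A, B, C, lam
    end

    U = rand(1:20, 2, 2, 2)     # a random count tensor
    P, A, B, C, lam = em_tensor(U)
    @show sum(P)                # the entries of P sum to 1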

3.2 EM Fixed Point Ideal for Tensors

As in Section 2.4, if we could compute all EM fixed points, then this would reveal the global maximizer of ℓ_U. Since a point is EM fixed if and only if it stays fixed after an E-step and an M-step, we can write rational function equations for the EM fixed points in Θ:

    λ_l = (1 / u_{+++}) · Σ_{i,j,k=1}^2 (λ_l a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for l = 1, 2,
    a_li = (1 / (u_{+++} λ_l)) · Σ_{j,k=1}^2 (λ_l a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for i, l = 1, 2,
    b_lj = (1 / (u_{+++} λ_l)) · Σ_{i,k=1}^2 (λ_l a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for j, l = 1, 2,
    c_lk = (1 / (u_{+++} λ_l)) · Σ_{i,j=1}^2 (λ_l a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for k, l = 1, 2.

Our goal is to understand the solutions to these equations for a fixed tensor U. We seek to find the variety they define in the polytope Θ and the image of that variety in M. In the EM algorithm we usually start with a_li, b_lj, c_lk, λ_l that are strictly positive. The a_li, b_lj, c_lk may become zero in the limit, but the parameters λ_l always remain positive when the u_ijk are positive, since the rows of A, B, C sum to 1. This justifies cancelling the factors λ_l in our equations. After this, the first equation is implied by the other three. Therefore, the set of all EM fixed points is a variety, and it is characterized by

    a_li = (1 / u_{+++}) · Σ_{j,k=1}^2 (a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for i, l = 1, 2,
    b_lj = (1 / u_{+++}) · Σ_{i,k=1}^2 (a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for j, l = 1, 2,
    c_lk = (1 / u_{+++}) · Σ_{i,j=1}^2 (a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk    for k, l = 1, 2.

These equations can be simplified further. For example, multiplying the first of them by u_{+++} gives

    a_li u_{+++} = Σ_{j,k=1}^2 (a_li b_lj c_lk / Σ_{s=1}^2 λ_s a_si b_sj c_sk) u_ijk.

Note that Σ_{j,k=1}^2 b_lj c_lk = 1, so the left-hand side equals a_li u_{+++} Σ_{j,k=1}^2 b_lj c_lk, and moving everything to one side yields

    a_li ( Σ_{j,k=1}^2 ( u_{+++} − u_ijk / Σ_{s=1}^2 λ_s a_si b_sj c_sk ) b_lj c_lk ) = 0,

which, using the identity p_ijk = Σ_{l=1}^2 λ_l a_li b_lj c_lk, becomes

    a_li ( Σ_{j,k=1}^2 ( u_{+++} − u_ijk / p_ijk ) b_lj c_lk ) = 0    for l, i = 1, 2.

We can do this simplification symmetrically for b_lj and c_lk, yielding

    a_li ( Σ_{j,k=1}^2 ( u_{+++} − u_ijk / p_ijk ) b_lj c_lk ) = 0    for i, l = 1, 2,
    b_lj ( Σ_{i,k=1}^2 ( u_{+++} − u_ijk / p_ijk ) a_li c_lk ) = 0    for j, l = 1, 2,
    c_lk ( Σ_{i,j=1}^2 ( u_{+++} − u_ijk / p_ijk ) a_li b_lj ) = 0    for k, l = 1, 2.

Therefore, the set of EM fixed points is a variety characterized by the above equations. We can simplify further, denoting by R the tensor with entries

    r_ijk = u_{+++} − u_ijk / p_ijk

and the fixed point equations become

    a_li ( Σ_{j,k=1}^2 r_ijk b_lj c_lk ) = 0    for all l, i = 1, 2,
    b_lj ( Σ_{i,k=1}^2 r_ijk a_li c_lk ) = 0    for all l, j = 1, 2,
    c_lk ( Σ_{i,j=1}^2 r_ijk a_li b_lj ) = 0    for all l, k = 1, 2.

This derivation yields the following theorem.

Theorem 3.1. The variety of EM fixed points for 2 × 2 × 2 tensors of rank_+ 2 in the polytope Θ is defined by the equations

    a_li ( Σ_{j,k=1}^2 r_ijk b_lj c_lk ) = 0    for all l, i = 1, 2,
    b_lj ( Σ_{i,k=1}^2 r_ijk a_li c_lk ) = 0    for all l, j = 1, 2,
    c_lk ( Σ_{i,j=1}^2 r_ijk a_li b_lj ) = 0    for all l, k = 1, 2,

where [r_ijk] = [ u_{+++} − u_ijk / p_ijk ].

The variety defined in Theorem 3.1 is reducible.
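The equations of Theorem 3.1 can be checked numerically at the output of the em_tensor sketch given after Algorithm 2: at a point the EM algorithm has converged to, the residuals below should be close to zero. This is our own verification sketch, and the function names are ours.

    # Residuals of the fixed point equations of Theorem 3.1 at (A, B, C, lam).
    function fixed_point_residuals(U, A, B, C, lam)
        uppp = sum(U)
        P = [sum(lam[l]*A[l,i]*B[l,j]*C[l,k] for l in 1:2)
             for i in 1:2, j in 1:2, k in 1:2]
        R = [uppp - U[i,j,k]/P[i,j,k] for i in 1:2, j in 1:2, k in 1:2]
        resA = [A[l,i]*sum(R[i,j,k]*B[l,j]*C[l,k] for j in 1:2, k in 1:2) for l in 1:2, i in 1:2]
        resB = [B[l,j]*sum(R[i,j,k]*A[l,i]*C[l,k] for i in 1:2, k in 1:2) for l in 1:2, j in 1:2]
        resC = [C[l,k]*sum(R[i,j,k]*A[l,i]*B[l,j] for i in 1:2, j in 1:2) for l in 1:2, k in 1:2]
        return maximum(abs, [resA; resB; resC])
    end

    U = rand(1:20, 2, 2, 2)
    P, A, B, C, lam = em_tensor(U)
    @show fixed_point_residuals(U, A, B, C, lam)   # small after convergence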

Definition 3.1. Let F be the ideal of EM fixed points, as in Theorem 3.1. A minimal prime of F is called relevant if it contains none of the 8 polynomials p_ijk = Σ_{l=1}^2 λ_l a_li b_lj c_lk.

Theorem 3.2. The ideal F of EM fixed points for tensors of nonnegative rank 2 has precisely 52 minimal primes consisting of 9 orbital classes. Moreover, the ideal is radical; hence it equals the intersection of its minimal primes.

Proof. While the ideal F is not a binomial ideal, we follow [KRS +] in using an approach based on the primary decomposition of binomial ideals given in [ES, Section 6]. Let F be the EM fixed point ideal

    F = ⟨ a_li ( Σ_{j,k=1}^2 r_ijk b_lj c_lk ), b_lj ( Σ_{i,k=1}^2 r_ijk a_li c_lk ), c_lk ( Σ_{i,j=1}^2 r_ijk a_li b_lj ) : i, j, k, l = 1, 2 ⟩.

Any prime ideal containing F contains either a_li or Σ_{j,k=1}^2 r_ijk b_lj c_lk for l, i ∈ {1, 2}, and either b_lj or Σ_{i,k=1}^2 r_ijk a_li c_lk for l, j ∈ {1, 2}, and either c_lk or Σ_{i,j=1}^2 r_ijk a_li b_lj for l, k ∈ {1, 2}. We categorize all primes containing F according to the set S of unknowns a_li, b_lj, and c_lk that they contain. There are 2^12 such subsets, and the symmetry group acts on this power set by permuting the rows of A, B, and C simultaneously, the columns of A, B, and C separately, and the matrices A, B, and C themselves. We pick one representative S from each orbit that is relevant; that is, we exclude those orbits for which p_ijk = Σ_{l=1}^2 λ_l a_li b_lj c_lk = 0. These are exactly the orbits containing an element p_ijk lying in the ideal ⟨S⟩. For each relevant representative S, we compute the cellular component

    F_S = ( (F + ⟨S⟩) : (Π S^c)^∞ ),

where S^c = {a_11, ..., a_22, b_11, ..., b_22, c_11, ..., c_22} \ S. Next we minimalize our cellular decomposition by removing all representatives S such that F_T ⊆ F_S for some representative T in another orbit. This leads to a list of 6 orbits comprising 11 ideals. Up to symmetry, each prime is uniquely determined by its attributes in Table 3.1. These are its set S, its degree and codimension, the ranks rA = rank(A), rB = rank(B), and rC = rank(C) at a generic point, the number of ideals in the orbit of S, and the number of elements in the primary decomposition. In each case, primality of the ideal was verified using either the Macaulay2 isPrime function or the linear elimination sequence in [GSS, Proposition 23(b)], which we discuss in detail in Section 5.1.

Table 3.1: Minimal primes of the EM fixed point ideal F for tensors of rank_+ 2. For each class the table records the set S, the cardinalities of S ∩ {a_li}, S ∩ {b_lj}, and S ∩ {c_lk}, the degree and codimension, the generic ranks rA, rB, rC, the orbit size, and the number of primes. The classes are represented by the sets

    { },  {a_11},  {a_11, b_11},  {a_11, a_12},  {a_11, a_22},  {a_11, b_11, c_11}.

In Table 3.1, while the ideals given by { } determine the fixed points in the interior of M, the ideals given by {a_11}, {a_11, b_11}, {a_11, a_22}, and {a_11, b_11, c_11} determine the fixed points on the non-interior boundary strata of M, as seen in [AHRZ]. That is, the map (3.1) sends any parameter point (A, B, C, Λ) in Θ at which the defining equations of these ideals vanish to a boundary stratum of M. The ideal given by {a_11, a_12} is degenerate because it yields a probability distribution outside of M. If a_11 = a_12 = 0

then A is not a stochastic matrix, since the first row of A does not sum to 1. While the number of primes corresponding to { } is 5, we only show the attributes of three. The two minimal primes not appearing in the list can be obtained as group actions of the ones in the list. We refer to the minimal primes appearing in the list as representatives of orbital classes. The total number of minimal primes can be read off the table by summing the products of the columns orbit and #primes. The orbit is the number of parameter sets in the orbit of the element from column 1. For example, the orbit of {a_11, a_22} consists of {a_11, a_22}, {b_11, b_22}, and {c_11, c_22}, so orbit = 3.

The most concise ideal is the minimal prime defined by setting a_11 = 0 and a_22 = 0, corresponding to one of two 5-dimensional substrata of M. It has defining equations

    a_{1,1},  a_{2,2},
    b_{1,1} r_{2,1,2} + b_{1,2} r_{2,2,2},    b_{1,1} r_{2,1,1} + b_{1,2} r_{2,2,1},
    b_{2,1} r_{1,1,2} + b_{2,2} r_{1,2,2},    b_{2,1} r_{1,1,1} + b_{2,2} r_{1,2,1},
    c_{1,1} r_{2,1,1} + c_{1,2} r_{2,1,2},    c_{1,1} r_{2,2,1} + c_{1,2} r_{2,2,2},
    c_{2,1} r_{1,2,1} + c_{2,2} r_{1,2,2},    c_{2,1} r_{1,1,1} + c_{2,2} r_{1,1,2},
    r_{1,1,2} r_{1,2,1} − r_{1,1,1} r_{1,2,2},    r_{2,1,2} r_{2,2,1} − r_{2,1,1} r_{2,2,2}.    (3.3)

Recall that

    r_ijk = u_{+++} − u_ijk / p_ijk = u_{+++} − u_ijk / ( Σ_{l=1}^2 λ_l a_li b_lj c_lk ),    (3.4)

thus the set of EM fixed points corresponding to a data tensor U = [u_ijk] defined by this ideal is obtained by substituting (3.4), clearing denominators, and saturating. In this case, the tensor R consists of two rank 1 slices. We also see that rA, rB, and rC are 2, since the determinants of A, B, and C do not appear in the decomposition.

We extend the computations of Theorem 3.2 to the case of tensors of nonnegative rank 3. The boundary stratification of the space of tensors of nonnegative rank 3 is not known, but the parameters that yield its stratification reside within the decomposition given in Table 3.2. We update the parameter polytope Θ = (A, B, C, Λ) with

    A = ( a_11  a_12 )    B = ( b_11  b_12 )    C = ( c_11  c_12 )    Λ = ( λ_1   0    0  )
        ( a_21  a_22 )        ( b_21  b_22 )        ( c_21  c_22 )        (  0   λ_2   0  )
        ( a_31  a_32 ),       ( b_31  b_32 ),       ( c_31  c_32 ),       (  0    0   λ_3 )

where A, B, C are stochastic matrices and Σ_{l=1}^3 λ_l = 1 with λ_i ≥ 0. We extend the EM algorithm in the natural way, with

    p_ijk = Σ_{l=1}^3 λ_l a_li b_lj c_lk.

Theorem 3.3. The ideal F of EM fixed points for tensors of nonnegative rank 3 has precisely 277 minimal primes consisting of 41 orbital classes. Up to symmetry, each prime is uniquely determined by its attributes in Table 3.2. Moreover, the ideal is not radical: the ideal corresponding to { } contains embedded components.

Table 3.2: Minimal primes of the EM fixed point ideal F for tensors of rank_+ 3. As in Table 3.1, each class is recorded with its set S, the cardinalities of S ∩ {a_li}, S ∩ {b_lj}, and S ∩ {c_lk}, its degree and codimension, the generic ranks rA, rB, rC, the orbit size, and the number of primes. The representative sets S listed in the table are

    { },  {a_11},  {a_11, b_11},  {a_11, b_11, c_11},  {a_11, a_12},  {a_11, a_21},  {a_11, a_22},
    {a_11, a_12, a_21},  {a_11, a_12, a_21, a_22},  {a_11, a_12, b_21},  {a_11, a_12, b_21, c_21},
    {a_11, a_12, b_21, b_22},  {a_11, a_21, b_11, b_21},  {a_11, a_21, b_11, b_21, c_11, c_21},
    {a_11, a_21, b_11, b_22, c_11, c_22},  {a_11, a_22, b_11, b_22},  {a_11, a_22, b_11, b_22, c_11, c_22},
    {a_11, a_12, a_21, b_21},  {a_11, a_12, a_21, b_21, c_21}.

4 MLE Using Boundary Strata

4.1 Boundary Stratification of Binary Tensors

Following [ARSZ], Allman, Hoşten, Rhodes, and Zwiernik [AHRZ] completely characterize the boundary stratification of binary tensors of nonnegative rank two, and for such tensors they give specific formulas for the ML estimate on each stratum. For example, in the 2 × 2 × 2 case there are 15 ridges of dimension 5. One of these strata is obtained as the image of those parameters (A, B, C, Λ) where a_11 = 0 and a_22 = 0. The resulting tensor P = [p_ijk] is of the form

    λ_1 · ( [ 0  0 ; 0  0 ],  [ a_12 b_11 c_11  a_12 b_11 c_12 ; a_12 b_12 c_11  a_12 b_12 c_12 ] )
  + λ_2 · ( [ a_21 b_21 c_21  a_21 b_21 c_22 ; a_21 b_22 c_21  a_21 b_22 c_22 ],  [ 0  0 ; 0  0 ] ),

where each summand is written as its two slices i = 1 and i = 2. [AHRZ] give the following ML formula corresponding to this 5-dimensional ridge, completely in terms of the data U = [u_ijk]:

    p̂_ijk = ( u_{ij+} · u_{i+k} ) / ( u_{i++} · u_{+++} ),    i, j, k = 1, 2.

For all of the strata there is a formula in terms of the data U. If, among all the formulas for all of the strata, this P̂ produces the maximum value of the log-likelihood function ℓ_U, and if P̂ is supermodular, then this is the MLE for U, and the MLE for this data lies on a 5-dimensional ridge of the model.
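The formula above is easy to evaluate directly. The sketch below (our own illustration; ridge_estimate is our name, and is_supermodular refers to the sketch in Section 2.2) computes this candidate estimate from a count tensor and checks supermodularity before accepting it.

    # Closed-form candidate MLE on the ridge a_11 = a_22 = 0, from a 2x2x2 count tensor U.
    function ridge_estimate(U)
        uppp = sum(U)
        [sum(U[i,j,:]) * sum(U[i,:,k]) / (sum(U[i,:,:]) * uppp)
         for i in 1:2, j in 1:2, k in 1:2]
    end

    U = rand(1:20, 2, 2, 2)
    Phat = ridge_estimate(U)
    @show sum(Phat)               # a probability tensor: the entries sum to 1
    @show is_supermodular(Phat)   # accept as a candidate only if supermodular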

Table 4.1 completely characterizes the boundary stratification of M. For the 2 × 2 × 2 case, we show that the [AHRZ] formula computations for the MLE are faster than the EM algorithm. We use the results in [ARSZ] and [AHRZ] to perform this experiment, along with several others, giving insight into the EM algorithm for tensors, maximum likelihood estimation, and the model M.

Table 4.1: Boundary stratification of tensors of nonnegative rank 2, listing the number of strata, their dimension, and the zeros of (A, B, C, Λ) that define them.

    1 stratum of dimension 7 (the interior):  { }
    6 strata of dimension 6:  a_11 = 0
    dimension 5 (type 5a):  a_1i = 0 and b_1j = 0
    dimension 5 (type 5b):  a_11 = a_22 = 0
    dimension 4:  a_1i = 0 and b_1j = 0 and c_1k = 0
    dimension 3 (rank_+ 1 tensors):  λ_i = 0

4.2 Experiments

In the following experiments we implement all EM and MLE computations using Julia, a high-level dynamic programming language for technical computing. Julia is relatively new; it was developed by researchers at MIT and first appeared in 2012. All graphical modeling is done in R. We present the following experiments as a series of questions and answers.

Experiment 4.1. In which boundary stratum of the model M is the MLE most likely to occur for a random nonnegative data tensor?

We uniformly and randomly generate 10,000 nonnegative tensors. For each tensor we use the [AHRZ] formulas to count the number of times the MLE lands on one of the five strata of the 7-dimensional space of tensors of rank_+ 2. There are 31 formulas to check, and for each formula we must verify that p̂_ijk is supermodular. Checking supermodularity requires verification of between 9 and 36 inequalities. In total, the computations for this experiment required 0.4 seconds. Table 4.2 shows the percentage distribution among the strata.

Table 4.2: Experiment 4.1: stratification attraction of the model M, giving the percentage of samples whose MLE lies on each type of stratum (7-dim, 6-dim, 5a-dim, 5b-dim, 4-dim, and 3-dim).

In R, we produce a 3-dimensional picture modelling this behavior on the 7-dimensional model M. We generate Figures 4.1 and 4.2 by running the EM algorithm on 200,000

randomly generated tensors [u_ijk] from the Jukes-Cantor slice given by

    ( u_111  u_112 )   ( x  y )           ( u_211  u_212 )   ( w  z )
    ( u_121  u_122 ) = ( z  w )    and    ( u_221  u_222 ) = ( y  x ).

A normalized [u_ijk] in the Jukes-Cantor slice is a point in the 3-dimensional simplex. 99.9% of the MLEs for these tensors occur evenly distributed among the interior and the three substrata of the 5b stratum of M, labelled 5b_1, 5b_2, and 5b_3. To each of these strata we associate a color, and to each of the points in the 3-dimensional simplex we assign the color of the boundary stratum containing the MLE at that point. This yields a partitioning of the 3-dimensional simplex. The 5b_1, 5b_2, and 5b_3 subsets form the same shape, rotated by 60°. Observe the linear boundary between the 5b_i subsets and the polynomial boundary of the interior.

Figure 4.1: The simplex is partitioned based on the MLE for the tensor in the Jukes-Cantor slice. The MLE of the pink points is in the interior of the model. The MLE of the light and dark turquoise points lies on the 5b_1 stratum. The second row depicts the 5b_1 subset from two different angles. The blank space between cells is where the shapes interlock. The second picture in the first row shows the interior and the 5b_1 subsets interlocked.

Figure 4.2: The figure on the left shows the 5b_1, 5b_2, and 5b_3 subsets locked together. The figure on the right shows all four subsets: the interior, 5b_1, 5b_2, and 5b_3. The orange region is the interior of the model.

Experiment 4.2. How much does the EM algorithm vary for one data tensor U = [u_ijk]?

We uniformly and randomly generate 100 nonnegative data tensors U, and for each U we run the EM algorithm from 1,000 different starting parameters. Recall that starting parameters are elements (A, B, C, Λ) of the polytope Θ. We determine the frequency with which the EM algorithm spreads, or finds MLEs, across different strata.

Table 4.3: Experiment 4.2: the percentage of data tensors for which the 1,000 EM runs spread across 1, 2, 3, 4, 5, or 6 strata.

Table 4.3 says that, given 1,000 different starting parameters, the EM algorithm will find local maxima on one particular stratum 23% of the time. 32% of the time the algorithm will spread across 2 strata, and so on. This table does not address the density of these spreads. In order to address this, for each run of 1,000 we count the number of times EM lands in one stratum more than 90% of the time. We found that this will occur in 80% of the samples. Informally, this means that the EM algorithm is reliable in the sense that it tends to be attracted to one stratum most of the time.

Experiment 4.3. How often does the EM algorithm produce the actual MLE given one starting parameter?

We uniformly and randomly generate 10,000 nonnegative data tensors U and run the EM algorithm on each U from one starting parameter with a maximum of 10,000 steps. We compute the actual MLE using the [AHRZ] formulas and compare. The EM algorithm converges in 80% of the samples and produces the actual MLE 76% of the time.

Experiment 4.4. How many times must the EM algorithm be run in order to find the actual MLE?

We input 1,000 uniformly and randomly generated nonnegative tensors U and compute the MLE for each one using the [AHRZ] formulas. We then run the EM algorithm on each U until it returns the MLE, and tally the number of different starting parameters required to hit the MLE. The EM algorithm finds the MLE given 1 starting parameter 75.5% of the time. It requires fewer than 10 different starting parameters to find the MLE 96.5% of the time.

Experiment 4.5. Is computing MLEs using the formulas faster than using the EM algorithm?

This is a valid question because computing MLEs with the formulas is not trivial. There are 31 formulas to check. Each formula has a canonical representative, as in a_11 = 0, a_22 = 0. To obtain the ML estimates for the parameters in the orbit of the canonical representative (b_11 = b_22 = 0 and c_11 = c_22 = 0), we permute the data tensor, perform the computation as if it were the canonical representative, and then permute the ML estimate in reverse. For each formula, supermodularity must be verified.

We generate 1,000 nonnegative, random, and uniform tensors U. For each U we compute the MLE using the [AHRZ] formulas and using the EM algorithm with a maximum of 10,000 steps, to ensure convergence 80% of the time. If the EM algorithm does not find the MLE given one set of starting parameters, we discard the trial. It must be noted that the EM algorithm for tensors that we are implementing is most likely not the optimally coded EM algorithm, thus these speeds are only estimates of the efficiency of the algorithm. We compute the mean, median, maximum, and minimum run times for each method. Table 4.4 shows that in these trials the EM algorithm was never faster than the [AHRZ] formulas. In fact, the slowest formula time still beats the fastest EM time.

Table 4.4: Experiment 4.5: mean, median, maximum, and minimum run times for the formula method and for the EM algorithm.

5 Implementation

Here we give an overview of the computational methods used throughout this project. The computational body of work can be split into two categories: the EM fixed point ideal decompositions, and the EM algorithm along with the [AHRZ] formulas. The former consists of cellular decomposition, determining the primality of the cellular components, and decomposing the non-primary components; all of this is implemented in Macaulay2. The latter consists of the EM algorithm itself, the coding of the [AHRZ] formulas, and all of the support functions required for gathering data from these objects. Initially, we attempted to implement this using R, but Julia proved to be greater than 10 times faster. All of the coding for this part is done in Julia, besides the modeling of Figures 4.1 and 4.2, which is done in R.

5.1 Cellular Decomposition, Primality, and Primary Decomposition

Macaulay2 is our primary tool for computing cellular decompositions, determining primality, and decomposing ideals into primary components. The most important theorem for determining primality is [GSS, Proposition 23]. It is stated therein without proof; for a concise proof, see [LS, pg. 3].

Lemma 5.1 ([GSS, Proposition 23]). Let J ⊆ R[x_1, ..., x_n] be an ideal containing a polynomial f = g x_1 + h with g, h not involving x_1 and g a non-zero divisor modulo J. Let J_1 = J ∩ R[x_2, ..., x_n] be the elimination ideal. Then J is prime if and only if J_1 is prime.

Algorithm 3 Pseudocode implementation of Lemma 5.1
    Input an ideal I ⊆ R[x_1, ..., x_n].
    Create LIST: a list of all variables in the ring.
    Compute K: a list of generators of a Gröbner basis of I.
    for i in 1:length(LIST)
        Set f = LIST[i]
        for j in 1:length(K)
            Set g = ∂K[j]/∂f    # the variables appear linearly, so g is the coefficient of f in K[j]
            if (I : g) == I (implying g is a nonzero divisor) then
                I = eliminate(LIST[i], I)
    Return I.

Following elimination, Algorithm 3 yields a simpler ideal, as measured by degree and codimension. After multiple eliminations, as in all of our cases, verification of primality using the Macaulay2 isPrime command takes only seconds. It is wise to update and maintain a sequence of strings representing the elimination sequence for fast verification. Consider the ideal (3.3) obtained by setting a_11 = a_22 = 0. In the syntax of Macaulay2, our algorithm will output the sequence of strings in Figure 5.1.

Figure 5.1: Elimination sequence for verifying primality of the EM fixed point ideal of tensors of rank_+ 2 corresponding to a_11 = a_22 = 0.

    K = first entries gens gb I;
    g = diff(a_(1,1), K#1);
    I : ideal(g) == I
    I = eliminate(a_(1,1), I);
    K = first entries gens gb I;
    g = diff(a_(2,2), K#0);
    I : ideal(g) == I
    I = eliminate(a_(2,2), I);
    K = first entries gens gb I;
    g = diff(b_(1,1), K#2);
    I : ideal(g) == I
    I = eliminate(b_(1,1), I);
    K = first entries gens gb I;
    g = diff(b_(2,1), K#5);
    I : ideal(g) == I
    I = eliminate(b_(2,1), I);
    isPrime(I)

The degree of (3.3) is reduced from 25 to 9, the codimension drops from 8 to 4, and primality is verified by isPrime in less than 1 second.

Our most important method for finding minimal primes is found in the discussion following Proposition 23 in [GSS]. This method is based on splitting the ideal I that we wish to decompose into two parts. Given an ideal I, if there is an element f of its Gröbner basis that factors as f = f_1 f_2, then

    √I = √⟨I, f_1⟩ ∩ √( ⟨I, f_2⟩ : f_1^∞ ).    (5.1)

In our case, the ideals are radical, so we drop the radical signs in (5.1). We keep a list of ideals whose intersection is the same as I. For each ideal we keep a list of the elements we have inverted by so far (for example, f_1 in ⟨I, f_2⟩ : f_1^∞) and saturate at each step with these elements. Eventually, the ideals either split into one or two prime parts, which we verify as in Lemma 5.1, or the splits result in ideals that are decomposable in under 5 minutes using Macaulay2's built-in functionality. This method worked invariably for our decompositions.

Algorithm 4 presents an overview of our method for computing cellular decompositions of the EM fixed point ideal for tensors of nonnegative rank 2. The code associated with this algorithm comprises the main body of our work with the EM fixed point ideals. For its implementation for 4 × 4 matrices, from which our code follows, see [KRS +].


More information

WA State Common Core Standards - Mathematics

WA State Common Core Standards - Mathematics Number & Quantity The Real Number System Extend the properties of exponents to rational exponents. 1. Explain how the definition of the meaning of rational exponents follows from extending the properties

More information

Subject Algebra 1 Unit 1 Relationships between Quantities and Reasoning with Equations

Subject Algebra 1 Unit 1 Relationships between Quantities and Reasoning with Equations Subject Algebra 1 Unit 1 Relationships between Quantities and Reasoning with Equations Time Frame: Description: Work with expressions and equations through understanding quantities and the relationships

More information

Mathematics Standards for High School Algebra I

Mathematics Standards for High School Algebra I Mathematics Standards for High School Algebra I Algebra I is a course required for graduation and course is aligned with the College and Career Ready Standards for Mathematics in High School. Throughout

More information

E. GORLA, J. C. MIGLIORE, AND U. NAGEL

E. GORLA, J. C. MIGLIORE, AND U. NAGEL GRÖBNER BASES VIA LINKAGE E. GORLA, J. C. MIGLIORE, AND U. NAGEL Abstract. In this paper, we give a sufficient condition for a set G of polynomials to be a Gröbner basis with respect to a given term-order

More information

Standard Description Agile Mind Lesson / Activity Page / Link to Resource

Standard Description Agile Mind Lesson / Activity Page / Link to Resource Publisher: Agile Mind, Inc Date: 19-May-14 Course and/or Algebra I Grade Level: TN Core Standard Standard Description Agile Mind Lesson / Activity Page / Link to Resource Create equations that describe

More information

Lecture 9 Classification of States

Lecture 9 Classification of States Lecture 9: Classification of States of 27 Course: M32K Intro to Stochastic Processes Term: Fall 204 Instructor: Gordan Zitkovic Lecture 9 Classification of States There will be a lot of definitions and

More information

The Generalized Neighbor Joining method

The Generalized Neighbor Joining method The Generalized Neighbor Joining method Ruriko Yoshida Dept. of Mathematics Duke University Joint work with Dan Levy and Lior Pachter www.math.duke.edu/ ruriko data mining 1 Challenge We would like to

More information

16.2. Definition. Let N be the set of all nilpotent elements in g. Define N

16.2. Definition. Let N be the set of all nilpotent elements in g. Define N 74 16. Lecture 16: Springer Representations 16.1. The flag manifold. Let G = SL n (C). It acts transitively on the set F of complete flags 0 F 1 F n 1 C n and the stabilizer of the standard flag is the

More information

Tropical decomposition of symmetric tensors

Tropical decomposition of symmetric tensors Tropical decomposition of symmetric tensors Melody Chan University of California, Berkeley mtchan@math.berkeley.edu December 11, 008 1 Introduction In [], Comon et al. give an algorithm for decomposing

More information

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra D. R. Wilkins Contents 3 Topics in Commutative Algebra 2 3.1 Rings and Fields......................... 2 3.2 Ideals...............................

More information

FLORIDA STANDARDS TO BOOK CORRELATION

FLORIDA STANDARDS TO BOOK CORRELATION FLORIDA STANDARDS TO BOOK CORRELATION Florida Standards (MAFS.912) Conceptual Category: Number and Quantity Domain: The Real Number System After a standard is introduced, it is revisited many times in

More information

Matrices and systems of linear equations

Matrices and systems of linear equations Matrices and systems of linear equations Samy Tindel Purdue University Differential equations and linear algebra - MA 262 Taken from Differential equations and linear algebra by Goode and Annin Samy T.

More information

Algebra , Martin-Gay

Algebra , Martin-Gay A Correlation of Algebra 1 2016, to the Common Core State Standards for Mathematics - Algebra I Introduction This document demonstrates how Pearson s High School Series by Elayn, 2016, meets the standards

More information

An Introduction to Transversal Matroids

An Introduction to Transversal Matroids An Introduction to Transversal Matroids Joseph E Bonin The George Washington University These slides and an accompanying expository paper (in essence, notes for this talk, and more) are available at http://homegwuedu/

More information

Algebra I, Common Core Correlation Document

Algebra I, Common Core Correlation Document Resource Title: Publisher: 1 st Year Algebra (MTHH031060 and MTHH032060) University of Nebraska High School Algebra I, Common Core Correlation Document Indicates a modeling standard linking mathematics

More information

11. Dimension. 96 Andreas Gathmann

11. Dimension. 96 Andreas Gathmann 96 Andreas Gathmann 11. Dimension We have already met several situations in this course in which it seemed to be desirable to have a notion of dimension (of a variety, or more generally of a ring): for

More information

Chapter 1 Vector Spaces

Chapter 1 Vector Spaces Chapter 1 Vector Spaces Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 110 Linear Algebra Vector Spaces Definition A vector space V over a field

More information

MIT Algebraic techniques and semidefinite optimization May 9, Lecture 21. Lecturer: Pablo A. Parrilo Scribe:???

MIT Algebraic techniques and semidefinite optimization May 9, Lecture 21. Lecturer: Pablo A. Parrilo Scribe:??? MIT 6.972 Algebraic techniques and semidefinite optimization May 9, 2006 Lecture 2 Lecturer: Pablo A. Parrilo Scribe:??? In this lecture we study techniques to exploit the symmetry that can be present

More information

Algebra 1 Mathematics: to Hoover City Schools

Algebra 1 Mathematics: to Hoover City Schools Jump to Scope and Sequence Map Units of Study Correlation of Standards Special Notes Scope and Sequence Map Conceptual Categories, Domains, Content Clusters, & Standard Numbers NUMBER AND QUANTITY (N)

More information

Ph.D. Qualifying Exam: Algebra I

Ph.D. Qualifying Exam: Algebra I Ph.D. Qualifying Exam: Algebra I 1. Let F q be the finite field of order q. Let G = GL n (F q ), which is the group of n n invertible matrices with the entries in F q. Compute the order of the group G

More information

Parameterizing orbits in flag varieties

Parameterizing orbits in flag varieties Parameterizing orbits in flag varieties W. Ethan Duckworth April 2008 Abstract In this document we parameterize the orbits of certain groups acting on partial flag varieties with finitely many orbits.

More information

Lecture 20 : Markov Chains

Lecture 20 : Markov Chains CSCI 3560 Probability and Computing Instructor: Bogdan Chlebus Lecture 0 : Markov Chains We consider stochastic processes. A process represents a system that evolves through incremental changes called

More information

ALGEBRA I CCR MATH STANDARDS

ALGEBRA I CCR MATH STANDARDS RELATIONSHIPS BETWEEN QUANTITIES AND REASONING WITH EQUATIONS M.A1HS.1 M.A1HS.2 M.A1HS.3 M.A1HS.4 M.A1HS.5 M.A1HS.6 M.A1HS.7 M.A1HS.8 M.A1HS.9 M.A1HS.10 Reason quantitatively and use units to solve problems.

More information

Permutation groups/1. 1 Automorphism groups, permutation groups, abstract

Permutation groups/1. 1 Automorphism groups, permutation groups, abstract Permutation groups Whatever you have to do with a structure-endowed entity Σ try to determine its group of automorphisms... You can expect to gain a deep insight into the constitution of Σ in this way.

More information

Chapter 1. Preliminaries

Chapter 1. Preliminaries Introduction This dissertation is a reading of chapter 4 in part I of the book : Integer and Combinatorial Optimization by George L. Nemhauser & Laurence A. Wolsey. The chapter elaborates links between

More information

Math 121 Homework 5: Notes on Selected Problems

Math 121 Homework 5: Notes on Selected Problems Math 121 Homework 5: Notes on Selected Problems 12.1.2. Let M be a module over the integral domain R. (a) Assume that M has rank n and that x 1,..., x n is any maximal set of linearly independent elements

More information

Algebra I Number and Quantity The Real Number System (N-RN)

Algebra I Number and Quantity The Real Number System (N-RN) Number and Quantity The Real Number System (N-RN) Use properties of rational and irrational numbers N-RN.3 Explain why the sum or product of two rational numbers is rational; that the sum of a rational

More information

0 Sets and Induction. Sets

0 Sets and Induction. Sets 0 Sets and Induction Sets A set is an unordered collection of objects, called elements or members of the set. A set is said to contain its elements. We write a A to denote that a is an element of the set

More information

2. Intersection Multiplicities

2. Intersection Multiplicities 2. Intersection Multiplicities 11 2. Intersection Multiplicities Let us start our study of curves by introducing the concept of intersection multiplicity, which will be central throughout these notes.

More information

Common Core State Standards with California Additions 1 Standards Map. Algebra I

Common Core State Standards with California Additions 1 Standards Map. Algebra I Common Core State s with California Additions 1 s Map Algebra I *Indicates a modeling standard linking mathematics to everyday life, work, and decision-making N-RN 1. N-RN 2. Publisher Language 2 Primary

More information

Maximum Likelihood Estimates for Binary Random Variables on Trees via Phylogenetic Ideals

Maximum Likelihood Estimates for Binary Random Variables on Trees via Phylogenetic Ideals Maximum Likelihood Estimates for Binary Random Variables on Trees via Phylogenetic Ideals Robin Evans Abstract In their 2007 paper, E.S. Allman and J.A. Rhodes characterise the phylogenetic ideal of general

More information

Symmetries and Polynomials

Symmetries and Polynomials Symmetries and Polynomials Aaron Landesman and Apurva Nakade June 30, 2018 Introduction In this class we ll learn how to solve a cubic. We ll also sketch how to solve a quartic. We ll explore the connections

More information

1 Fields and vector spaces

1 Fields and vector spaces 1 Fields and vector spaces In this section we revise some algebraic preliminaries and establish notation. 1.1 Division rings and fields A division ring, or skew field, is a structure F with two binary

More information

College Algebra with Corequisite Support: Targeted Review

College Algebra with Corequisite Support: Targeted Review College Algebra with Corequisite Support: Targeted Review 978-1-63545-056-9 To learn more about all our offerings Visit Knewtonalta.com Source Author(s) (Text or Video) Title(s) Link (where applicable)

More information

Quivers of Period 2. Mariya Sardarli Max Wimberley Heyi Zhu. November 26, 2014

Quivers of Period 2. Mariya Sardarli Max Wimberley Heyi Zhu. November 26, 2014 Quivers of Period 2 Mariya Sardarli Max Wimberley Heyi Zhu ovember 26, 2014 Abstract A quiver with vertices labeled from 1,..., n is said to have period 2 if the quiver obtained by mutating at 1 and then

More information

MATH 223A NOTES 2011 LIE ALGEBRAS 35

MATH 223A NOTES 2011 LIE ALGEBRAS 35 MATH 3A NOTES 011 LIE ALGEBRAS 35 9. Abstract root systems We now attempt to reconstruct the Lie algebra based only on the information given by the set of roots Φ which is embedded in Euclidean space E.

More information

YOUNG TABLEAUX AND THE REPRESENTATIONS OF THE SYMMETRIC GROUP

YOUNG TABLEAUX AND THE REPRESENTATIONS OF THE SYMMETRIC GROUP YOUNG TABLEAUX AND THE REPRESENTATIONS OF THE SYMMETRIC GROUP YUFEI ZHAO ABSTRACT We explore an intimate connection between Young tableaux and representations of the symmetric group We describe the construction

More information

Codewords of small weight in the (dual) code of points and k-spaces of P G(n, q)

Codewords of small weight in the (dual) code of points and k-spaces of P G(n, q) Codewords of small weight in the (dual) code of points and k-spaces of P G(n, q) M. Lavrauw L. Storme G. Van de Voorde October 4, 2007 Abstract In this paper, we study the p-ary linear code C k (n, q),

More information

Linear Classification: Linear Programming

Linear Classification: Linear Programming Linear Classification: Linear Programming Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong 1 / 21 Y Tao Linear Classification: Linear Programming Recall the definition

More information

COMMON CORE STATE STANDARDS TO BOOK CORRELATION

COMMON CORE STATE STANDARDS TO BOOK CORRELATION COMMON CORE STATE STANDARDS TO BOOK CORRELATION Conceptual Category: Number and Quantity Domain: The Real Number System After a standard is introduced, it is revisited many times in subsequent activities,

More information

CHAPTER 1. AFFINE ALGEBRAIC VARIETIES

CHAPTER 1. AFFINE ALGEBRAIC VARIETIES CHAPTER 1. AFFINE ALGEBRAIC VARIETIES During this first part of the course, we will establish a correspondence between various geometric notions and algebraic ones. Some references for this part of the

More information

Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J

Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J Frank Curtis, John Drew, Chi-Kwong Li, and Daniel Pragel September 25, 2003 Abstract We study central groupoids, central

More information

LMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009

LMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009 LMI MODELLING 4. CONVEX LMI MODELLING Didier HENRION LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ Universidad de Valladolid, SP March 2009 Minors A minor of a matrix F is the determinant of a submatrix

More information

Mathematics. Number and Quantity The Real Number System

Mathematics. Number and Quantity The Real Number System Number and Quantity The Real Number System Extend the properties of exponents to rational exponents. 1. Explain how the definition of the meaning of rational exponents follows from extending the properties

More information

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations.

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations. POLI 7 - Mathematical and Statistical Foundations Prof S Saiegh Fall Lecture Notes - Class 4 October 4, Linear Algebra The analysis of many models in the social sciences reduces to the study of systems

More information

Algebra 1 3 rd Trimester Expectations Chapter (McGraw-Hill Algebra 1) Chapter 9: Quadratic Functions and Equations. Key Vocabulary Suggested Pacing

Algebra 1 3 rd Trimester Expectations Chapter (McGraw-Hill Algebra 1) Chapter 9: Quadratic Functions and Equations. Key Vocabulary Suggested Pacing Algebra 1 3 rd Trimester Expectations Chapter (McGraw-Hill Algebra 1) Chapter 9: Quadratic Functions and Equations Lesson 9-1: Graphing Quadratic Functions Lesson 9-2: Solving Quadratic Equations by Graphing

More information

Math 676. A compactness theorem for the idele group. and by the product formula it lies in the kernel (A K )1 of the continuous idelic norm

Math 676. A compactness theorem for the idele group. and by the product formula it lies in the kernel (A K )1 of the continuous idelic norm Math 676. A compactness theorem for the idele group 1. Introduction Let K be a global field, so K is naturally a discrete subgroup of the idele group A K and by the product formula it lies in the kernel

More information

Lebesgue Measure on R n

Lebesgue Measure on R n CHAPTER 2 Lebesgue Measure on R n Our goal is to construct a notion of the volume, or Lebesgue measure, of rather general subsets of R n that reduces to the usual volume of elementary geometrical sets

More information

MULTIPLICITIES OF MONOMIAL IDEALS

MULTIPLICITIES OF MONOMIAL IDEALS MULTIPLICITIES OF MONOMIAL IDEALS JÜRGEN HERZOG AND HEMA SRINIVASAN Introduction Let S = K[x 1 x n ] be a polynomial ring over a field K with standard grading, I S a graded ideal. The multiplicity of S/I

More information

HOMOLOGY THEORIES INGRID STARKEY

HOMOLOGY THEORIES INGRID STARKEY HOMOLOGY THEORIES INGRID STARKEY Abstract. This paper will introduce the notion of homology for topological spaces and discuss its intuitive meaning. It will also describe a general method that is used

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

1 The linear algebra of linear programs (March 15 and 22, 2015)

1 The linear algebra of linear programs (March 15 and 22, 2015) 1 The linear algebra of linear programs (March 15 and 22, 2015) Many optimization problems can be formulated as linear programs. The main features of a linear program are the following: Variables are real

More information

Algebraic matroids are almost entropic

Algebraic matroids are almost entropic accepted to Proceedings of the AMS June 28, 2017 Algebraic matroids are almost entropic František Matúš Abstract. Algebraic matroids capture properties of the algebraic dependence among elements of extension

More information

Compatibly split subvarieties of Hilb n (A 2 k)

Compatibly split subvarieties of Hilb n (A 2 k) Compatibly split subvarieties of Hilb n (A 2 k) Jenna Rajchgot MSRI Combinatorial Commutative Algebra December 3-7, 2012 Throughout this talk, let k be an algebraically closed field of characteristic p

More information

The Advantage Testing Foundation Solutions

The Advantage Testing Foundation Solutions The Advantage Testing Foundation 2016 Problem 1 Let T be a triangle with side lengths 3, 4, and 5. If P is a point in or on T, what is the greatest possible sum of the distances from P to each of the three

More information

How well do I know the content? (scale 1 5)

How well do I know the content? (scale 1 5) Page 1 I. Number and Quantity, Algebra, Functions, and Calculus (68%) A. Number and Quantity 1. Understand the properties of exponents of s I will a. perform operations involving exponents, including negative

More information

GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory.

GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory. GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory. Linear Algebra Standard matrix manipulation to compute the kernel, intersection of subspaces, column spaces,

More information

INITIAL COMPLEX ASSOCIATED TO A JET SCHEME OF A DETERMINANTAL VARIETY. the affine space of dimension k over F. By a variety in A k F

INITIAL COMPLEX ASSOCIATED TO A JET SCHEME OF A DETERMINANTAL VARIETY. the affine space of dimension k over F. By a variety in A k F INITIAL COMPLEX ASSOCIATED TO A JET SCHEME OF A DETERMINANTAL VARIETY BOYAN JONOV Abstract. We show in this paper that the principal component of the first order jet scheme over the classical determinantal

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

Huntington Beach City School District Grade 8 Mathematics Accelerated Standards Schedule

Huntington Beach City School District Grade 8 Mathematics Accelerated Standards Schedule Huntington Beach City School District Grade 8 Mathematics Accelerated Standards Schedule 2016-2017 Interim Assessment Schedule Orange Interim Assessment: November 1-18, 2016 Green Interim Assessment: January

More information

2. TRIGONOMETRY 3. COORDINATEGEOMETRY: TWO DIMENSIONS

2. TRIGONOMETRY 3. COORDINATEGEOMETRY: TWO DIMENSIONS 1 TEACHERS RECRUITMENT BOARD, TRIPURA (TRBT) EDUCATION (SCHOOL) DEPARTMENT, GOVT. OF TRIPURA SYLLABUS: MATHEMATICS (MCQs OF 150 MARKS) SELECTION TEST FOR POST GRADUATE TEACHER(STPGT): 2016 1. ALGEBRA Sets:

More information

Optimization Theory. A Concise Introduction. Jiongmin Yong

Optimization Theory. A Concise Introduction. Jiongmin Yong October 11, 017 16:5 ws-book9x6 Book Title Optimization Theory 017-08-Lecture Notes page 1 1 Optimization Theory A Concise Introduction Jiongmin Yong Optimization Theory 017-08-Lecture Notes page Optimization

More information

A Potpourri of Nonlinear Algebra

A Potpourri of Nonlinear Algebra a a 2 a 3 + c c 2 c 3 =2, a a 3 b 2 + c c 3 d 2 =0, a 2 a 3 b + c 2 c 3 d =0, a 3 b b 2 + c 3 d d 2 = 4, a a 2 b 3 + c c 2 d 3 =0, a b 2 b 3 + c d 2 d 3 = 4, a 2 b b 3 + c 2 d 3 d =4, b b 2 b 3 + d d 2

More information