Markov basis for testing homogeneity of Markov chains

Similar documents
MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Markov Bases for Two-way Subtable Sum Problems

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

On a perturbation method for determining group of invariance of hierarchical models

Toric Fiber Products

A Markov Basis for Conditional Test of Common Diagonal Effect in Quasi-Independence Model for Square Contingency Tables

Markov bases and subbases for bounded contingency tables

Discrete methods for statistical network analysis: Exact tests for goodness of fit for network data Key player: sampling algorithms

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Boundary cliques, clique trees and perfect sequences of maximal cliques of a chordal graph

Distance-reducing Markov bases for sampling from a discrete sample space

Symmetries and Polynomials

SYMMETRIES IN R 3 NAMITA GUPTA

Indispensable monomials of toric ideals and Markov bases

3.2 Configuration model

Markov Chains. X(t) is a Markov Process if, for arbitrary times t 1 < t 2 <... < t k < t k+1. If X(t) is discrete-valued. If X(t) is continuous-valued

Notes 6 : First and second moment methods

An Introduction to Exponential-Family Random Graph Models

Lecture 4: September Reminder: convergence of sequences

1 Ways to Describe a Stochastic Process

6 Evolution of Networks

Introduction to Machine Learning CMU-10701

30 Classification of States

1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution

Stat 516, Homework 1

Markov Chains, Stochastic Processes, and Matrix Decompositions

Markov chains. Randomness and Computation. Markov chains. Markov processes

Markov Chain Monte Carlo The Metropolis-Hastings Algorithm

Example: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected

Convergence Rate of Markov Chains

Math 396. Quotient spaces

Computer Engineering II Solution to Exercise Sheet Chapter 12

CPSC 540: Machine Learning

The Markov Chain Monte Carlo Method

MARKOV CHAIN MONTE CARLO

Discrete time Markov chains. Discrete Time Markov Chains, Definition and classification. Probability axioms and first results

Chapter 2 Direct Current Circuits

k times l times n times

Lecture 20 : Markov Chains

arxiv: v6 [math.nt] 10 Oct 2018

2. Intersection Multiplicities

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION

F (x) = P [X x[. DF1 F is nondecreasing. DF2 F is right-continuous

18.6 Regression and Classification with Linear Models

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations.

Markov Chains Handout for Stat 110

0 0 0 # In other words it must saisfy the condition that the successive rst non-zero entries in each row are indented.

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015

University of Chicago Autumn 2003 CS Markov Chain Monte Carlo Methods

Uniform generation of random graphs with power-law degree sequences

CS 798: Homework Assignment 3 (Queueing Theory)

CS6964: Notes On Linear Systems

THE REPRESENTATION THEORY, GEOMETRY, AND COMBINATORICS OF BRANCHED COVERS

(1)(a) V = 2n-dimensional vector space over a field F, (1)(b) B = non-degenerate symplectic form on V.

The Metropolis-Hastings Algorithm. June 8, 2012

Markov Chains. Andreas Klappenecker by Andreas Klappenecker. All rights reserved. Texas A&M University

Robert Collins CSE586, PSU Intro to Sampling Methods

JUSTIN HARTMANN. F n Σ.

Oriented majority-vote model in social dynamics

Advanced Automata Theory 9 Automatic Structures in General

An Efficient Calculation Algorithm in Continuous-Time Markov Analysis Using Large-Scale Numerical Calculation

arxiv: v1 [math.co] 17 Dec 2007

Exact Goodness-of-Fit Testing for the Ising Model

Approximate Inference

Graphic sequences, adjacency matrix

arxiv: v1 [math.pr] 26 Mar 2008

SOMEWHAT STOCHASTIC MATRICES

Monte Carlo Methods. Leon Gu CSD, CMU

Math 31 Lesson Plan. Day 22: Tying Up Loose Ends. Elizabeth Gillaspy. October 31, Supplies needed: Colored chalk.

PRIMARY DECOMPOSITION FOR THE INTERSECTION AXIOM

Machine Learning for Data Science (CS4786) Lecture 24

Linear equations in linear algebra

Mathematical Analysis on the HP-48. [Reference Code: C78]

Overview of classical mirror symmetry

A note on the Isomorphism Problem for Monomial Digraphs

Definition: A binary relation R from a set A to a set B is a subset R A B. Example:

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ).

Local MAX-CUT in Smoothed Polynomial Time

Advanced Topics in Probability

Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices

Markov Chains and Stochastic Sampling

LINEAR ALGEBRA W W L CHEN

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

arxiv:math/ v1 [math.ac] 11 Jul 2006

1 Last time: determinants

2 Mathematical Model, Sequential Probability Ratio Test, Distortions

STOCHASTIC PROCESSES Basic notions

Eigenvalues and Eigenvectors

On the principle of smooth fit in optimal stopping problems

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Use of Asymptotics for Holonomic Gradient Method

10.4 The Kruskal Katona theorem

Consensus Problems on Small World Graphs: A Structural Study

Introduction to Vectors

CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash

Markov Chain Monte Carlo Methods

Artificial Intelligence

Binary Operations. Chapter Groupoids, Semigroups, Monoids

Robert Collins CSE586, PSU Intro to Sampling Methods

LECTURE 5, FRIDAY

Designing Information Devices and Systems I Fall 2018 Lecture Notes Note Introduction: Op-amps in Negative Feedback

Exam in Discrete Mathematics

Transcription:

Markov basis for testing homogeneity of Markov chains Akimichi Takemura and Hisayuki Hara University of Tokyo July 6, 2009 Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 1 / 39

Table of contents 1 Introduction and Notation 2 Properties of the configuration and the toric ideal 3 MB for two-state case with arbitrary length 4 MB for arbitrary finite state space and length of three 5 MB for three-state, length of four 6 Discussions and conjectures Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 2 / 39

Introduction and Notation Introduction and Notation Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 3 / 39

Introduction and Notation Introduction and Notation We study Markov basis for testing homogeneity of Markov chains. Markov chain models are used in various fields, such as models for social mobility studies or state switching models in econometrics. We give a complete description of Markov basis for the following cases: two-state, arbitrary length. arbitrary finite state space and length of three. three-state, length of four (already very hard!) The general case remains to be a conjecture. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 4 / 39

Introduction and Notation Introduction and Notation Consider a Markov chain X t, t = 1,...,T ( 3), over a finite state space S = {1,...,S} (S 2). Each observed Markov chain is a path ω = (s 1,...,s T ) S T. 1 1 2... T 2. S Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 5 / 39

Introduction and Notation Introduction and Notation Let p (t) ij = P(X t+1 = j X t = i) denote the transition probability from time t to t + 1. Let {π i } denote the initial distribution of X 1. The probability of a path ω = (s 1, s 2,...,s T ) is written as p(ω) = π s1 p 1 s 1 s 2...p T 1 s T 1 s T. (1) We consider the null hypothesis of homogeneity: H : p (t) ij = p ij, t = 1,...,T 1. (2) Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 6 / 39

Introduction and Notation Introduction and Notation Then the probability of the path is p(ω) = π s1 p s1 s 2...p st 1 s T, which is a toric model (i.e. the right hand side is a monomial). So we can test H by a Markov basis approach. The parameters of the model under H is the set of transition probabilities {p ij } i,j S and the initial distribution {π i } i S. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 7 / 39

Introduction and Notation Introduction and Notation Suppose that we observe a (multi)set of N paths W = {ω 1,...,ω N } of the Markov chain. 1 1 2... T 2. S Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 8 / 39

Introduction and Notation Introduction and Notation We identify the set of paths W with a T-way contingency table x = {x(ω), ω S T }, where x(ω) denotes the frequency of the path ω in W. We are going to count the number of transitions in W. x t ij : the number of transitions from s t = i to s t+1 = j in W. x t i : the frequency of the state s t = i in W. In particular x 1 i is the frequency of the initial state s 1 = i. The set of the numbers of transitions {xij t }, i, j S, t = 1,...,T 1 forms a sufficient statistic for non-homogeneous model (conditional independence model) p(ω) = π s1 p 1 s 1 s 2...p T 1 s T 1 s T. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 9 / 39

Introduction and Notation Introduction and Notation Let x + ij = T 1 denote the total number of transitions from i to j in W (ignoring the time t). t=1 The sufficient statistic under H is given by b = b(x) = {xs 1, s S} {x + ij, i, j S}. x t ij Fiber: F b = {x N ST b(x) = b}. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 10 / 39

Introduction and Notation Introduction and Notation Example of elements of a fiber: 2 2 We want to obtain a set of moves for transforming a set of paths to another set in the same fiber: Markov basis. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 11 / 39

Properties of the configuration and the toric ideal Properties of the configuration and the toric ideal homogeneous Markov chain toric ideal (HMC toric ideal) Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 12 / 39

Properties of the configuration and the toric ideal Properties of HMC toric ideal For illustration we write out the configuration A for S = 2 and T = 4. 1111111211211122121112121221122221112112212121222211221222212222 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 11 3 2 1 1 1 0 0 0 2 1 0 0 1 0 0 0 12 0 1 1 1 1 2 1 1 0 1 1 1 0 1 0 0 21 0 0 1 0 1 1 1 0 1 1 2 1 1 1 1 0 22 0 0 0 1 0 0 1 2 0 0 0 1 1 1 2 3 The columns 1121 and 1211 are identical! Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 13 / 39

Properties of the configuration and the toric ideal Properties of HMC toric ideal 1 2 1121 1 2 1211 (alternative figure: ) We need to consider degree 1 moves. We could choose just one column from A among identical columns. But it is not clear which column to choose. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 14 / 39

Properties of the configuration and the toric ideal Properties of HMC toric ideal Crossing path swapping t i Data with crossing path swappings look the same: corresponds either to {111, 212} or to {112, 211}. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 15 / 39

Properties of the configuration and the toric ideal Properties of HMC toric ideal FACT: crossing path swappings are the square-free degree two moves for the non-homogeneous model (linearly ordered conditional independence model): (π s1 can be absorbed into p 1 s 1 s 2.) Proposition 1 (Dobra) p(ω) = π s1 p 1 s 1 s 2...p T 1 s T 1 s T. The set of crossing path swappings constitutes a Markov basis for the linearly ordered conditional independence model. (Roughly) we will be identifying all the data sets with the same picture. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 16 / 39

Properties of the configuration and the toric ideal Properties of HMC toric ideal Example of a move: 2 by 2 swap t t + 1 t t + 1 i 1 i 1 j 1 j 1 (3) i 2 i 2 j 2 j 2 (Roughly again) Note that this move corresponds to a collection of moves, because in the picture the other parts of the paths are not specified. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 17 / 39

MB for two-state case with arbitrary length Markov basis for two-state case (arbitrary length) Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 18 / 39

MB for two-state case with arbitrary length MB for two-state case Theorem 1 A Markov basis for S = 2, T 4, consists of the following moves. 1 cross path swappings, 2 degree one moves, 3 2 by 2 swaps of the following form:, 4 moves of the following form (the case of 2 included). Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 19 / 39

MB for two-state case with arbitrary length MB for two-state case For S = 2, the complexity of Markov basis is the same for arbitrary T. (Looks somewhat like Markov complexity result.) Actually for T = 3, we do not need the fourth type. Moves are sums of at most two loops. Proof of the theorem is not too difficult, but not too easy. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 20 / 39

MB for arbitrary finite state space and length of three Markov basis for arbitrary finite state space and length of three Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 21 / 39

MB for arbitrary finite state space and length of three MB for T = 3 Here we give an explicit form of a Markov basis for the case of T = 3. The number of states S = S is arbitrary. Let t :ij denote the transition from i to j at time t to t + 1. For T = 3 we only need to consider t = 1, 2. Let i 1,...,i m, m S, be distinct elements of S. Similarly let j 1,...,j m, m S, be distinct elements of S. Define a move (a permutation ) Z(i 1,...,i m ; j 1,...,j m ) by Z(i 1,...,i m ; j 1,...,j m ) : {1:i 1 j 1, 1:i 2 j 2,...,1:i m j m } {2:i 1 j m, 2:i 2 j 1,...,2:i m j m 1 }. (4) Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 22 / 39

MB for arbitrary finite state space and length of three MB for T = 3 A typical move for S = 6 with m = 4, (i 1, i 2, i 3, i 4 ) = (1, 2, 4, 5), (j 1, j 2, j 3, j 4 ) = (1, 3, 5, 6). s\t 1 i 1 1 2 3 j 1 2 i 2 3 j 2 4 5 i 3 i 4 j 3 6 j 4 Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 23 / 39

MB for arbitrary finite state space and length of three MB for T = 3 Theorem 2 A Markov basis for HMC toric ideal with T = 3 is given by the set of crossing path swappings and moves corresponding to m times m permutations Z(i 1,...,i m ; j 1,...,j m ) in (4), where m = 2,...,S, i 1,...,i m are distinct, and j 1,...,j m are distinct. Proof of this theorem is easy. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 24 / 39

MB for three-state, length of four Markov basis for three-state, length of four Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 25 / 39

MB for three-state, length of four MB for S = 3, T = 4 We wanted to check the case S = 3, T = 4, assuming that it is not too complicated. However this case turned out to be incredibly difficult. We now have a theorem, but the proof at the moment exists in my hand-written memo of 150 pages with hundreds of pictures. To state our theorem we need some more definitions. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 26 / 39

MB for three-state, length of four MB for S = 3, T = 4 An extended simple loop Intuitively, it is a loop, such that when we are moving towards the future we follow positive edges (solid lines) and when we ave moving towards the past we follow negative edges (dotted lines). Also we require that each node is passed at most once. An important example of an extended simple loop for S = 3 and T = 4 is depicted as follows. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 27 / 39

MB for three-state, length of four MB for S = 3, T = 4 Sign-conformal sum of extended simple loops Two extended simple loops L 1, L 2 are sign-conformal if each edge t :ij belonging to both L 1 and L 2 has the same sign in two loops. A sign-conformal sum of two extended simple loops L 1, L 2 is a graph where each edge is weighted by a non-zero integer. The weight of an edge is given by it sign in L 1 and the number of L k s containing the edge. Example: sign-conformal sum of and is 2 Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 28 / 39

MB for three-state, length of four MB for S = 3, T = 4 Now our summary result for 3 4 case is stated as follows. Theorem 3 A Markov basis for S = 3 and T = 4 is given by the set of crossing path swappings and the set of moves which are sign-conformal sums of at most three extended simple loops. Actually this statement is composed in hindsight. We are actually proving the following theorem: Theorem 4 A Markov basis for S = 3 and T = 4 consists of moves in the following list (16 types of moves). Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 29 / 39

MB for three-state, length of four MB for S = 3, T = 4 0 [Type 0] Crossing path swapping. 1 [Type 1, Deg1 ] Degree one moves. 2 [Type 2, 2:2swap ] 2 by 2 swaps of (3). 3 [Type 3, 1*1swap ] Degree 2 move of partial path swapping with m = 1 in the following picture: t t. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 30 / 39

MB for three-state, length of four MB for S = 3, T = 4 4 [Type 4, 3:3 permutation ] Since S = 3, the 3 by 3 permutation is written as {t :1i 1, t :2i 2, t :3i 3 } {t :1j 1, t :2j 2, t :3j 3 } where 1 t < t 4 and both {i 1, i 2, i 3 } and {i 1, i 2, i 3 } are permutations of {1, 2, 3}. Further, for this move to be degree three, we require i 1 j 1, i 2 j 2 and i 3 j 3. An example of this move is 5 [Type 5, 2:2:2 swap ] Sum of three loops of length 4: Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 31 / 39

MB for three-state, length of four MB for S = 3, T = 4 6 [Type 6, 2 by 2 swap of unequal length, 2:2 uneq ] A swap of partially specified partial paths for 3 4 case such as. 7 [Type 7, 2+3loop, sum of loops of length 4 and 6] There are three types, 7A, 7B, 7C: *. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 32 / 39

MB for three-state, length of four MB for S = 3, T = 4 8 [Type 8, 3loops, sum of three loops of length 4] = + + 9 [Type 9, 3loopsW, sum of three loops of length 4 with overlapped edges] 2 2 = + + Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 33 / 39

MB for three-state, length of four MB for S = 3, T = 4 10 [Type 10, 2+3+3loopW3, sum of three loops of lengths 4, 6 and 6 with 3 overlapped edges] 2 2 2 = + + 11 [Type 11, 2+2+3loopW4, sum of three loos of lengths 4, 6 and 6 with 4 overlapped edges] 2 2 = + + 2 2 Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 34 / 39

MB for three-state, length of four MB for S = 3, T = 4 12 [Type 12, 2+3+3loopW5, sum of three loops of lengths 4, 6 and 6 with 5 overlapped edges] 2 2 2 2 2 = + + 13 [Type 13, 3+3loop, sum of two loops of length 6] 2 = + Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 35 / 39

MB for three-state, length of four MB for S = 3, T = 4 14 [Type 14, LL1, a long extended simple loop of length 8] 15 [Type 15, LL2, a long extended simple loop of length 10] Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 36 / 39

Discussions and conjectures Discussions and conjectures Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 37 / 39

Discussions and conjectures Discussions and conjectures From the case of S = 2, we expect that there exists some kind of Markov complexity, i.e., the complexity of MB does not depend on T. (Bold) Conjecture: For general S, MB consists of at most S sign-conformal sums of extended simple loops. Somehow, roughly speaking, the moves for pictures look very similar to toric ideals with respect to these pictures. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 38 / 39

Bibliography I Discussions and conjectures Akimichi Takemura and Hisayuki Hara (2009). Markov chain Monte Carlo test of homogeneity of Markov chains. In preparation. Takemura and Hara (Univ. of Tokyo) Testing homogeneity of Markov chains July 6, 2009 39 / 39