MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Markov chain Monte Carlo tests for designed experiments

Size: px

Start display at page:

Download "MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Markov chain Monte Carlo tests for designed experiments"

Job Simpson
6 years ago
Views:

1 MATHEMATICAL ENGINEERING TECHNICAL REPORTS Markov chain Monte Carlo tests for designed experiments Satoshi AOKI and Akimichi TAKEMURA METR November 2006 DEPARTMENT OF MATHEMATICAL INFORMATICS GRADUATE SCHOOL OF INFORMATION SCIENCE AND TECHNOLOGY THE UNIVERSITY OF TOKYO BUNKYO-KU, TOKYO , JAPAN WWW page:

2 The METR technical reports are published as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author s copyright. These works may not be reposted without the explicit permission of the copyright holder.

3 Markov chain Monte Carlo tests for designed experiments Satoshi Aoki Department of Mathematics and Computer Science Kagoshima University and Akimichi Takemura Graduate School of Information Science and Technology University of Tokyo November, 2006 Abstract We consider conditional exact tests of factor effects in designed experiments for discrete response variables. Similarly to the analysis of contingency tables, a Markov chain Monte Carlo method can be used for performing exact tests, when large-sample approximations are poor and the enumeration of the conditional sample space is infeasible. For designed experiments with a single observation for each run, we formulate log-linear or logistic models and consider a connected Markov chain over an appropriate sample space. In particular, we investigate fractional factorial designs with 2 p q runs, noting correspondences to the models for 2 p q contingency tables. 1 Introduction Exact calculations of p values for statistical conditional tests arise mainly in the context of analyzing contingency tables. For example, Fisher s exact test is frequently used for evaluating the hypothesis that the row effect and column effect are independent in the 2 2 contingency tables. Fisher s exact test is generalized to I J contingency tables in [11]. Traditionally, statistical tests for contingency tables have relied heavily on large-sample approximations for sampling distribution of the test statistics. However, many works have shown that large-sample approximations can be very poor when the contingency table contains both small and large expected frequencies even when the sample size is large. See [12], for example. Moreover, coupled with rapid development both in computer power and in techniques of algorithms, exact calculations of p values become feasible in various 1

4 settings for practical use. Consequently, for many types of problems where some ingenious calculation schemes are invented, it is unnecessary to use large-sample approximations for sampling distributions nowadays when their adequacy is in doubt. A typical example is the network algorithm by [18] for calculating exact p values of Freeman-Halton tests in two-way contingency tables. See the survey paper by [2]. At the same time simulation techniques for estimating p values by Monte Carlo procedures have also developed. In particular, for the problems where a closed form expression of the sampling distribution can not be obtained, Monte Carlo method provide powerful tools. Note that, in contrast to the large-sample approximations, we can estimate p values in arbitrary accuracy, theoretically, by increasing simulation sizes. However, for many models, such as general hierarchical log-linear models in multi-way contingency tables, direct generation of random sample is not straightforward. In this case, Markov chain Monte Carlo techniques can be used. For performing Markov chain Monte Carlo methods for sampling from discrete sample space, an important problem is how to construct a connected Markov chain on the given sample space. Note that, if an arbitrary connected Markov chain is constructed, the chain can be modified to give a connected and aperiodic Markov chain with stationary distribution being the desired null distribution by the usual Metropolis procedure ([14], for example). As for this point, the first breakthrough work is given by [8]. The key notion of [8] is a Markov basis, which enables to construct a connected Markov chain for arbitrary observed data set. [8] presented a general algorithm for computing a Markov basis in the settings of a general discrete exponential family of distribution. Their approach relies on the existence of a Gröbner basis of a well specified polynomial ideal. After [8], the techniques of the Markov chain Monte Carlo method for sampling from discrete conditional distributions are rapidly developed in the past decade. See for example [9], [10] and the works by Aoki and Takemura ([4], [3], [5], [6], [23], [24]). In this paper, we consider conditional exact tests in the context of designed experiments with counts (or ratios of counts) observations. In most of the classical literatures on designed experiments, the responses are assumed to be normally distributed. However, in many practical situations, the experimental data are not normally distributed. For such non-normal data, the generalized linear models are frequently used. See [13] or Chapter 13 of [25], for example. In these literatures, however, exact testing procedures for non-normal data are not considered. Since the experimental design is used when the cost of obtaining the data is relatively high, it is very important to develop techniques of exact procedures for the case of non-normal responses. Therefore in this manuscript, we consider the exact testing procedures for non-normal responses, based on the theory of the generalized linear models. For discrete responses, the above background and strategies also apply, i.e., to calculate p values for conditional tests, 1. traditionally, large-sample approximations such as the normal distribution or chisquare distribution are used, 2. if the observed data set contains both small and large expected values, the adequacy of the approximation becomes poor, 2

5 3. if the sample space is of moderate size, or some ingenious algorithms can be used, an exact calculation of p values is possible, 4. if an exact calculation is not feasible, we can rely on Monte Carlo procedure, 5. if a closed form expression of the null distribution is not given, Markov chain Monte Carlo procedure can be employed, if a Markov basis is available. The topic we consider in this paper is No. 5 of the above list. In Section 2, we formulate conditional exact tests of factor effects for fractional factorial designs. For designed experiments with a single observation for each run, we formulate log-linear or logistic models and consider how to construct a null model to be tested using the theory of generalized linear models. In Section 3, we consider Markov chain Monte Carlo tests for designed experiments. First we give a definition of Markov bases and a simple algorithm for evaluating p values by the Markov chain Monte Carlo tests in Section 3.1. In Section 3.2, we consider correspondences between the fractional factorial designs with 2 p q runs and models for 2 p q contingency tables. We end the paper with some discussions in Section 4. 2 Conditional tests for fractional factorial designs In this section, we consider the exact conditional tests for the discrete observations derived from fractional factorial designs. We investigate the designs with a single observation for each run, which is either a count or a ratio of counts. For the former case we consider the log-linear models, and for the latter case we consider the logistic models. Since our arguments for the two cases are almost the same, we first explain in detail the log-linear case in Section 2.1, and then give only a short description and a remark for the logistic case in Section Exact conditional tests for log-linear models of Poisson observations First we investigate the case that the observations are counts of some events. In this case, it is natural to consider Poisson models. To clarify the procedures of exact tests, we take a close look at an example of fractional factorial design with counts observations. Table 1 is a 1/8 fraction of a full factorial design (i.e., a fractional factorial design) defined from the aliasing relation ABDE = ACDF = BCDG = I, (1) and response data analyzed in [7] and reanalyzed in [13]. In Table 1, the observation y is the number of defects arising in a wave-soldering process in attaching components to an electronic circuit card. In Chapter 7 of [7], he considered seven factors of a wavesoldering process: (A) prebake condition, (B) flux density, (C) conveyer speed, (D) preheat condition, (E) cooling time, (F) ultrasonic solder agitator and (G) solder temperature, 3

6 Table 1: Design and number of defects y for the wave-solder experiment Factor y Run A B C D E F G each at two levels with three boards from each run being assessed for defects. The aim of this experiment is to decide which levels for each factors are desirable to reduce solder defects. In this paper, we only consider designs with a single observation for each run. Therefore, in this example, we focus on the totals for each run in Table 1. This is natural for the settings of Poisson models, since the set of the totals for each run is the sufficient statistics for the parameters. We also ignore the second observation in run 11, which is an obvious outlier as pointed out in [13]. Therefore the weighted total of run 11 is ( ) 3/2 = By replacing 2 by 1 in Table 1, we rewrite k p design 4

7 matrix as D, where each element is +1 or 1. Consequently, we have D = , y = Write y = (y 1,..., y k ) and D = (d ij ) = (d 1,..., d p ) where d j = (d 1j,..., d kj ) { 1, +1} k is the j-th column vector of D. k is the number of runs. If there are q aliasing relations defining this design, k = 2 p q holds (p = 7, q = 3 for this example). We define d st and d stu, 1 s < t < u p, as and d st = (d 1s d 1t,..., d ks d kt ) d stu = (d 1s d 1t d 1u,..., d ks d kt d ku ) for later use. The statistical model for this type of data is constructed from the theory of generalized linear models ([17]). Assume that the observations y i are mutually independently distributed with µ i = E(y i ), i = 1,..., k. The mean parameter µ i is expressed as g(µ i ) = β 0 + β 1 x i1 + + β ν x iν, where g( ) is the link function and x i1,..., x iν are the ν covariates defined below. The sufficient statistic is written as k i=1 x ijy i. The canonical link for the Poisson distribution is g(µ i ) = log µ i. Now we define covariates. We write the ν-dimensional parameter β and the covariate matrix X as β = (β 0, β 1,..., β ν 1 ) and X = 1 x 11 x 1ν x k1 x kν 1 = ( 1 k x 1 x ν 1 ), 5.

8 where 1 k = (1,..., 1) is the k-dimensional column vector. Since the likelihood function is written as ( k µ y i k ) ( i e µ i k ) y i=1 i! e µ i = exp y i log µ i y i=1 i! ( i=1 k ) ( ) e µ ν 1 i = exp β 0 1 k y i=1 i! y + β j x jy ( j=1 k ) e µ i = exp (β X y), y i! i=1 the sufficient statistic for β is X y = (1 k y, x 1y,..., x ν 1y). The matrix X is constructed from the design matrix D to reflect the effects of the factors and their interactions which we intend to measure. For example, a simple model which only includes the main effects for each factor is given as X = (1 k D), i.e., x j = d j for j = 1,..., ν 1 = p. On the other hand, we can consider a more complicated model containing various interaction effects, under the condition that it is consistent with the aliasing relations. In this example, the aliasing relation up to four-factor interactions is derived from (1) as follows. I = ABDE = ABFG = ACDF = ACEG = BCDG = BCEF = DEFG A = BDE = BFG = CDF = CEG, B = ADE = AFG = CDG = CEF C = ADF = AEG = BDG = BEF, D = ABE = ACF = BCG = EFG E = ABD = ACG = DFG, F = ABG = ACD = BCF = DEG G = ABF = ACE = BCD = DEF AB = DE = FG = ABDG = ACDG = ACEF = BCDF = BCEG AC = DF = EG = ABEF = BCDE = BCFG AD = BE = CF = ABCG = AEFG = BDFG = CDEG BC = DG = EF = ABDF = ABEG = ACDE = ACFG BD = AE = CG = ABCF = ADFG = BEFG = CDEF CD = AF = BG = ABCE = ADEG = BDEF = CEFG AG = BF = CE = ABCD = ADEF = BDEG = CDFG ABC = ADG = AEF = BDF = BEG = CDE = CFG Subject to the above aliasing relations, we may consider appropriate models where all the parameters are estimable. For example, the saturated model for this example includes 16(= k) parameters, when X is the Hadamard matrix of the order 16. One of the interpretations of the saturated model includes seven main effects, seven two-factor interaction effects, AB, AC, AD, AG, BC, BD, CD, and one three-factor interaction effect, ABC. We write this model as ABC/AD/BD/CD/AG/E/F by the manner of the hierarchical models. In this case, as in Table 2, the columns of X can be indexed as X = (1 k, d 1, d 2, d 12, d 3, d 13, d 23, d 123, d 4, d 14, d 24, d 5, d 34, d 6, d 7, d 17 ). 6

9 Table 2: Hadamard matrix of the order 16 and two interpretations I A B AB C AC BC ABC D AD BD E CD F G AG I A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD Note that the interpretation of the models is not unique. An extreme interpretation of the saturated model is given by ignoring three main effects, E, F, G, and considering all the higher interaction effects among A, B, C, D, which is written as ABCD. This interpretation is not realistic in the context of designed experiments, but it is useful for understanding Markov bases in terms of contingency tables. These two interpretations of the saturated model are shown in Table 2. Note that the concept of identifiable model in the saturated model is also given in [19]. Using the theory of Gröbner basis, [19] gives a method to obtain one of the maximal identifiable models for a given design. See also [22] and [20]. Since the saturated model cannot be tested, we consider an appropriate submodel of the saturated model. For the purpose of illustration we focus on the model considered in [13], which is given as AC/BD/E/F/G, (2) i.e., the model of seven main effects and two two-factor interactions. We treat this model as the null model and consider significance tests. In this case, the null hypothesis can be described as follows. Permuting the columns of Table 2, we partition the covariate matrix X of the saturated model as X = (X 0 X 1 ), X 0 = (1 k, d 1, d 2, d 3, d 4, d 5, d 6, d 7, d 13, d 24 ) = (1 k, x 1,..., x ν 1 ), X 1 = (d 12, d 23, d 123, d 14, d 34, d 17 ) = (x ν,..., x k 1 ), and consider the corresponding parameter β = (β 0, β 1,..., β k 1 ). Then the submodel is specified in the form of a null hypothesis H 0 : β ν = = β k 1 = 0. 7

10 Under H 0, the nuisance parameters are β 0,..., β ν 1 and the sufficient statistic for the nuisance parameters is X 0y. Then the conditional distribution of y given the sufficient statistics is written as f(y X 0y = X 0y o ) = C(X 0y o ) k i=1 1 y i!, (3) where C(X 0y o ) is the normalizing constant determined from X 0y o and written as C(X 0y o ) 1 = ( k ) 1, (4) y i! and y F(X 0 yo ) F(X 0y o ) = {y X 0y = X 0y o, y i is a nonnegative integer for i = 1,..., k}. (5) Note that by sufficiency the conditional distribution does not depend on the values of the nuisance parameters. We can now consider significance tests against various alternatives to H 0. An important alternative is to test the effect of a single additional effect. For example in the above example we can test presence of AB-interaction effect by considering the alternative hypothesis H 1 : β ν 0. Or if we are interested in the goodness-of-fit test, then the alternative hypothesis is H 1 : (β ν,..., β k 1 ) (0,..., 0). Depending on the alternative hypothesis, we can use appropriate test statistic T (y), such as the likelihood ratio statistic for testing H 0 against H 1. In Section 3 we give a procedure to sample from the conditional distribution (3). Therefore we can assess the conditional distribution of any test statistic T (y) under H 0. For the purpose of illustration we now consider goodness-of-fit procedures. Traditional χ 2 tests evaluate the upper probability for some discrepancy measures such as the deviance, the likelihood ratio or Pearson goodness-of-fit, based on the asymptotic distribution, χ 2 k ν. For example, the likelihood ratio statistic T (y) = G 2 (y) = 2 k i=1 i=1 y i log y i ˆµ i (6) is frequently used, where ˆµ i is the maximum likelihood estimate for µ i under the null model (i.e., fitted value), given by ˆµ = (64.53, 47.25,..., 51.42) for our example. Then for the observed data y o = (y1, o..., yk o), T (y o ) = G 2 (y o ) is calculated as T (y o ) = and the corresponding asymptotic p value is from the asymptotic distribution χ 2 6. This result tells us that the null hypothesis is highly 8

11 Table 3: Design and number of good parts y out of 1000 for the windshield molding slugging experiment Factor Run A B C D y significant and is rejected. Using the conditional distribution (3), the exact p value is written as p = f(y X 0y = X 0y o )1(T (y) T (y o )), where y F(X 0 yo ) 1(T (y) T (y o )) = { 1, if T (y) T (y o ), 0, otherwise. Unfortunately, however, an enumeration of all the elements in F(X 0y o ) and hence the calculation of the normalizing constant C(X 0y o ) is usually computationally infeasible for large sample space. Instead, we consider a Markov chain Monte Carlo method described in Section Exact conditional tests for logistic models of binomial observations. Next we consider the case that the observation for each run is a ratio of counts. Table 3 is a 1/2 fraction of a full factorial design (i.e., a fractional factorial design) defined from the relation ACD = I (7) and response data given by [16] and reanalyzed in [13]. In Table 3, the observation y is the number of good parts out of 1000 during the stamping process in manufacturing windshield modeling. The purpose of [16] is to decide the levels for four factors, (A) polyfilm thickness, (B) oil mixture, (C) gloves and (D) metal blanks, which most improve the slugging condition. Similarly to Section 2.1, we rewrite this data as 9

12 D = , y = As for a statistical model for this type of data, it is natural to suppose that the distribution of the observation y i is the mutually independent binomial distribution Bin(µ i, n i ), i = 1,..., k, where n i = 1000, i = 1,..., k = 8 for this example. Following the theory of generalized linear models, we also consider the logit link, which is the canonical link for the binomial distribution. It expresses the relation between the mean parameter µ i and the systematic part as g(µ i ) = log µ i 1 µ i = β 0 + β 1 x 1i + + β ν x iν. The covariate matrix and the corresponding parameters are defined similarly as Section 2.1, i.e., permuting the columns of the Hadamard matrix of the order 8, we write X = (X 0 X 1 ) in such a way that X 0 is the covariate matrix for the appropriate null model. In this example, we consider the simple main-effects model, A/B/C/D, which is considered in [13]. For this model, the covariate matrix is written as X 0 = (1 k D). Similarly to Section 2.1, we consider the conditional tests for various alternatives to this model. In this case, since the likelihood function is written as k i=1 ( ni y i ) µ y i i (1 µ i) n i y i = = k ( ni y i i=1 k ( ni i=1 y i ) ( ) yi (1 µ i ) n µi i 1 µ i ) (1 µ i ) n i exp(β X 0y), the nuisance parameters under the null hypothesis are X 0y, n 1,..., n k. Consequently, the exact conditional tests are based on the conditional distribution, f(y X 0y = X 0y o, n 1,..., n k ) = C(X 0y o, n 1,..., n k ) k i=1 1 y i!(n i y i )!, (8) where C(X 0y o, n 1,..., n k ) is the normalizing constant determined from X 0y o, n 1,..., n k and written as ( k ) C(X 0y o, n 1,..., n k ) 1 1 =, (9) y i!(n i y i )! y F(X 0 yo,n 1,...,n k ) i=1 10

13 and F(X 0y o, n 1,..., n k ) = {y X 0y = X 0y o, y i {0, 1,..., n i }, i = 1,..., k}. (10) For notational convenience, we extend y to ỹ = (y 1,..., y k, n 1 y 1,..., n k y k ) for the binomial model. Corresponding to this ỹ, we also extend ν k matrix X 0 to ( ) X X 0 = 0 O ν,k, (11) I k where O ν,k is the ν k zero matrix and I k is the identity matrix of the order k. In the theory of the toric ideals, X 0 is called the Lawrence lifting of the configuration X 0. See [15] for details. Using ỹ and X 0, the condition that X 0 y and n 1,..., n k are fixed is simply written that X 0 ỹ is fixed. Hereafter for notational simplicity, we write y and X 0 instead of ỹ and X 0. Namely, to express the conditional distribution and its support for the binomial model, we use the expression (3)(4)(5), those for the Poisson model, instead of (8)(9)(10), respectively, 3 Markov chain Monte Carlo tests for the designed experiments In this section, we consider the Markov chain Monte Carlo methods for calculating p values defined in Section 2. In Section 3.1, we give an explanation of Markov chain Monte Carlo methods, along with the definition of Markov bases. We also describe some algorithms and softwares, which are useful in applications. In Section 3.2, we investigate the relation between the fractional factorial designs and contingency tables. 3.1 Markov chain Monte Carlo methods for evaluating p values To perform the exact tests defined in Section 2, a useful approach is to generate samples from the conditional distribution f(y X 0y = X 0y o ) and calculate p values for any test statistic. In particular, when the closed form expression of the null distribution can not be obtained, a Markov chain Monte Carlo approach is a valuable tool. If a connected Markov chain over F(X 0y o ) is constructed, the chain can be modified to give a connected and aperiodic Markov chain with stationary distribution f(y X 0y = X 0y o ) by the usual Metropolis procedure. To construct a connected chain, a frequently used approach is to prepare a Markov basis defined in [8]. Let M(X 0) be the set of integer vectors which are in the kernel of X 0, i.e., M(X 0) = {z = (z 1,..., z k ) X 0z = 0, z i is an integer for i = 1,..., k}, 11 I k

14 where 0 is the zero vector. We call the element in M(X 0) a move for X 0, in the sense that adding z M(X 0) to any y does not change the sufficient statistics, i.e., X 0(y + z) = X 0y. An important point is that y + z can include a negative element. On the other hand, if y + z F(X 0y), i.e., y + z is still a non-negative vector, we see that y is moved to y + z F(X 0y) by z. Now we give the definition of a Markov basis. Definition 3.1. A Markov basis for X 0 is a finite set of moves B = {z 1,..., z L }, z j M(X 0), j = 1,..., L, such that, for any y, y F(X 0y o ), there exists A > 0, (ε 1, z j1 ),..., (ε A, z ja ) with ε s { 1, +1}, z js B, s = 1,..., A, satisfying y = y + A ε s z js and y + s=1 a ε s z js F(X 0y o ) for a = 1,..., A. s=1 By definition, a Markov basis enables to construct a connected chain over F(X 0y o ), which is modified so as to have the null distribution f(y X 0y = X 0y o ) as the stationary distribution by the Metropolis-Hastings procedure. Therefore we can perform various conditional tests by the Monte Carlo sampling. We give a simple algorithm to calculate p values for some test statistic T ( ) based on the Markov chain Monte Carlo sampling. Input: Markov basis B, observed data y o, covariate matrix X 0, size of sample N, null distribution f( ), test statistic T ( ) Output: p value Variables: obs, count, sig, y, y next Step 1: obs = T (y o ). y = y o. count = 0. sig = 0. Step 2: Choose z from B randomly. I = {n y + nz F(X 0y o )}. Step 3: Select y next from {y + nz n I} with probability p n = f(y + nz) f(y + nz). n I Step 4: If T (y next ) obs then sig = sig + 1. Step 5: y = y next. count = count + 1. Step 6: If count < N then Go to Step 2. Step 7: Estimated p value is sig N Note that we need not calculate the normalizing constant, C(X 0y o ) in (4) or C(X 0y o, n 1,..., n k ) in (9), of the null distribution f( ), since it is canceled in the numerator and denominator in Step 3. Derivation of Markov bases is itself a very interesting problem. Markov bases can be very complicated and hard to compute for large models. Many works, including the 12

15 original work by [8], have relied on the theory of computational algebra and Gröbner bases. See [8], [9], [10]. On the other hand, a series of works by Aoki and Takemura investigates the structure of minimal Markov bases and gives some characterizations. In particular, Aoki and Takemura([4], [3], [5]) give the expression of the minimal Markov bases directly (i.e., not by using algebraic algorithm) for some problems of contingency tables. In applications, it is most convenient for readers to rely on algebraic computational packages such as 4ti2 ([1]). For example, consider the problem we have seen in Section 2.1. For the model (2), the covariate matrix X 0 is a matrix. To calculate the Markov basis for this X 0 by 4ti2, we only have to prepare a datafile and run the command markov. Then the list of a minimal Markov basis is instantly given as The above output shows that a minimal Markov bases for this X 0 consists of 35 moves, which corresponds to each row. Using a minimal Markov basis, we perform the likelihood ratio test based on (6) in Section 2.1. After 100, 000 burn-in steps, we construct 1, 000, 000 Monte Carlo samples. In contrast to the asymptotic p value , the estimated p value is 0.032, with estimated standard deviation , where we use a batching method to obtain an estimate of variance, see [14] and [21]. Figure 1 shows a histogram of the Monte Carlo sampling generated from the exact conditional distribution of the likelihood ratio statistic under the null hypothesis, along with the corresponding asymptotic distribution χ 2 6. Figure 1 shows that the asymptotic distribution understates the probability that the values of the test statistic for the samples is not less than the observed value, and overemphasizes the significance. 13

16 Figure 1: Asymptotic and Monte Carlo estimated exact distribution Table 4: Eight-run 2 p q fractional factorial designs (p q = 3) Number of factors p Resolution Design Generators 4 IV D = ABC 5 III D = AB, E = AC 6 III D = AB, E = AC, F = BC 7 III D = AB, E = AC, F = BC, G = ABC 3.2 Markov bases and corresponding models for 2 p q contingency tables In this section, we investigate relationships between contingency tables and fractional factorial designs with 2 p q runs. As noted in Section 1, Markov bases have been mainly investigated in the context of contingency tables. For example, [6] gives an expression of minimal Markov bases of all the hierarchical models for 2 4 contingency tables. This list may be sufficient in application for the analysis of 2 4 contingency tables, since the hierarchical model is the most natural class of models to be considered. We show in this section that, considering the fractional factorial designs, we encounter some new models and Markov bases, which do not correspond to hierarchical models of contingency tables. Fractional factorial designs with 8 runs. First, we consider fractional factorial designs with 8 runs, i.e., the case of p q = 3. The most frequently used designs are listed in Table 4. We clarify the relationships between these designs and the models of 2 3 contingency tables y = (y ijk ), 1 i, j, k 2, for the Poisson observations, and the models of 2 4 contingency tables y = (y ijkl ), 1 i, j, k, l 2, for the binomial observations. 14

17 In the case of Poisson observations, we write eight observations as if they are the frequencies of 2 3 contingency table, i.e., y = (y 111, y 112, y 121, y 122, y 211, y 212, y 221, y 222 ). In the case of p = 5, for example, the design and the observations are given as follows. Factor Run A B C D E y y y y y y y y y 222 For this type of data, we define ν-dimensional parameter β and the covariate matrix X according to an appropriate model we consider, as explained in Section 2. First consider the simple main effect model A/B/C/D/E (ν = 5). To test this model against various alternatives, Markov chain Monte Carlo testing procedure needs a Markov basis for the covariate matrix X 0 = Note that each row of the X 0y corresponds to the sufficient statistic under the null model A/B/C/D/E. In this case, the sufficient statistic is given as. y, y 1, y 2, y 1, y 2, y 1, y 2, y 11 + y 22, y 12 + y 21, y y 2 2, y y 2 1, (12) where we use the notations such as y = y ijk, y i = i=1 j=1 k=1 2 2 y ijk, y ij = j=1 k=1 2 y ijk. k=1 Here we see that the sufficient statistic (12) is nothing but a well-known sufficient statistic for the conditional independence model AB/AC, given as {y ij }, {y i k }, i, j, k = 1, 2. (13) 15

18 The one-to-one relation between (12) and (13) is easily shown as y ij = y i + y j (y ij + y i j ) 2, y i k = y i + y k (y i k + y i k), (14) 2 where {i, i }, {j, j } and {k, k } are distinct indices, respectively. This correspondence is, of course, due to the aliasing relation D = AB, E = AC. Next consider another model. Since there are eight observations, we can estimate eight parameters at most (in the saturated model). Since the saturated model cannot be tested, let us consider the models of seven parameters, i.e., case of ν = 6. If we restrict our attention to the hierarchical models, five main effects and one of the two-factor interaction effects, BC, BE, CD, DE, can be included in the models, since the aliasing relation is given as A = BD = CE, B = AD, C = AE, D = AB, E = AC, BC = DE, BE = CD = ABC. If our null model includes BC or DE, i.e., if our null model is written as A/BC/D/E or A/B/C/DE, we add the column ( ) to the covariate matrix X 0. In this case, the sufficient statistic under the null model includes y 11 + y 22 and y 12 + y 21 in addition to (12), which is nothing but a well-known sufficient statistic for the no three-factor interaction model, AB/AC/BC, {y ij }, {y i k }, {y jk }, i, j, k = 1, 2, by the similar relations to (14). On the other hand, if our null model includes BE or CD, i.e., if our null model is written as A/BE/C/D or A/B/CD/E, we have to add the column ( ) to the covariate matrix X 0. In this case, the sufficient statistic under the null model includes y y y y 221 and y y y y 222 in addition to (12). This is one of the models which do not have corresponding models in the hierarchical models of three-way contingency tables. We write this new model as AB/AC + (ABC). The sufficient statistic for this model is {y ij }, {y i k }, y y y y 221, y y y y 222, i, j, k = 1, 2. Similarly, we can specify the corresponding models of three-way contingency tables (for the factors A, B, C) to all the possible models for the designs of Table 4, as if the observations are the frequencies of 2 3 contingency tables. The result is summarized in Table 5. 16

19 Table 5: Eight-run 2 p q fractional factorial designs and the corresponding models of threeway contingency tables (p q = 3) Design : p = 4, D = ABC ν Null model Corresponding model of 2 3 table 4 A/B/C/D A/B/C + (ABC) a 5 AB/C/D AB/C + (ABC) a 6 AB/AC/D AB/AC + (ABC) a Design : p = 5, D = AB, E = AC ν Null model Corresponding model of 2 3 table 5 A/B/C/D/E AB/AC 6 A/BC/D/E AB/AC/BC A/BE/C/D AB/AC + (ABC) a Design : p = 6, D = AB, E = AC, F = BC ν Null model Corresponding model of 2 3 table 6 A/B/C/D/E/F AB/AC/BC a The sufficient statistic for (ABC) is y y y y 221, y y y y 222. In the case of Binomial observations, there are 16 observations. Similarly to the Poisson case, we treat the observations as if they are the frequencies of 2 4 contingency table. In the case of p = 5, for example, the design and the observations are given as follows. Factor Run A B C D E y y 1111 y y 1121 y y 1211 y y 1221 y y 2111 y y 2121 y y 2211 y y 2221 y 2222 For this type of data, we also specify parameter β and the covariate matrix according to the appropriate models, by replacing X by X of (11). Note that the elements of y is ordered as y = (y 1111, y 1121,..., y 2211, y 2221, y 1112, y 1122,..., y 2212, y 2222 ). Accordingly, correspondences to the models of 2 4 contingency tables are easily obtained and the result is given in Table 6. 17

20 Table 6: Eight-run 2 p q fractional factorial designs and the corresponding models of threeway contingency tables (p q = 3) Design : p = 4, D = ABC ν Null model Corresponding model of 2 4 table 4 A/B/C/D AD/BD/CD + (ABC) a + (ABCD) b 5 AB/C/D ABD/CD + (ABC) a + (ABCD) b 6 AB/AC/D ABD/ACD + (ABC) a + (ABCD) b Design : p = 5, D = AB, E = AC ν Null model Corresponding model of 2 4 table 5 A/B/C/D/E ABD/ACD + (ABC) a 6 A/BC/D/E ABD/ACD/BCD/ABC A/BE/C/D ABD/ACD + (ABC) a + (ABCD) b Design : p = 6, D = AB, E = AC, F = BC ν Null model Corresponding model of 2 4 table 6 A/B/C/D/E/F ABD/ACD/BCD/ABC a The sufficient statistic for (ABC) is {y ijk }, i, j, k = 1, 2. b The sufficient statistic for (ABCD) is y 111l + y 122l + y 212l + y 221l, y 112l + y 121l + y 211l + y 222l, l = 1, 2. Table 6 is automatically converted from Table 5 as follows. By definition, D is added to all the generating sets. Note also that the sufficient statistic for each model includes {y ijk }, 1 i, j, k 2, by definition, which yields Table 6. Therefore the models which do not include all of AB, AC and BC do not correspond to hierarchical models. Fractional factorial designs with 16 runs. Next we consider fractional factorial designs with 16 runs, i.e., the case of p q = 4. Table 7 is a list of sixteen-run 2 p q fractional factorial designs (p q = 4, p 10) from Section 4 of [25]. By the similar considerations to the 8 run cases, we can seek the corresponding models of 2 4 contingency tables for the Poisson observations, and models of 2 5 contingency tables for the Binomial observations. Since the modeling for Binomial observations can be easily obtained from the Poisson case as we have seen, we only consider the Poisson case here. Since at most sixteen parameters are estimable for the sixteen-run designs, we can consider various models of main effects and interaction effects. For example, the saturated model of the p = 5 design, E = ABCD, can include all the main and two-factor interactions, AB/AC/AD/AE/BC/BD/BE/CD/CE/DE. Note that for the models of p = 5, 6, 7, 8 in Table 7, each main effect and two-factor interaction is estimable. (On the other hand, for the resolution III models of p = 9, 10, some of the two-factor interactions are not estimable.) Among the models which include all the main effects and some of the two-factor interaction effects, some models have 18

21 Table 7: Sixteen-run 2 p q fractional factorial designs (p q = 4) Number of factors p Resolution Design Generators 5 V E = ABCD 6 IV E = ABC, F = ABD 7 IV E = ABC, F = ABD, G = ACD 8 IV E = ABC, F = ABD, G = ACD H = BCD 9 III E = ABC, F = ABD, G = ACD H = BCD, J = ABCD 10 III E = ABC, F = ABD, G = ACD H = BCD, J = ABCD, K = CD the corresponding hierarchical model in the 2 4 contingency tables if we write the sixteen observations as y = {y ijkl }, i, j, k, l = 1, 2. For example, for the p = 6 design of E = ABC, F = ABD, the model of 6 main effects and 5 two-factor interaction effects, AB/AC/AD/BC/BD/E/F, has a corresponding model of ABC/ABD in the 2 4 contingency tables. By the aliasing relation AB = CE = DF, AC = BE, AD = BF, AE = BC, AF = BD, CD = EF, CF = DE = ABCD, it is seen that there are = 48 distinct models such as AB/AC/AD/AE/AF/E/F, AB/AC/AD/AE/BD/E/F, AB/AC/AD/BC/AF/E/F, AB/AC/AD/BC/BD/E/F,. DF/BE/BF/BC/AF/E/F, DF/BE/BF/BC/BD/E/F, which correspond to the model of ABC/ABD in the 2 4 contingency tables. By the similar considerations, we can specify all the models for the designs of Table 7 which correspond some hierarchical models in the 2 4 contingency tables. The result is shown in Table 8. One of the merits to specify corresponding hierarchical models of contingency tables is a possibility to make use of general already known results on Markov bases of contingency tables. For example, [10] shows that a Markov basis can be constructed by the primitive moves, i.e., degree 2 moves, for the decomposable graphical models in the contingency tables. In our designed experiments, therefore, Markov basis for the models which correspond to decomposable graphical models of contingency tables can be constructed only 19

22 Table 8: Sixteen-run 2 p q fractional factorial designs and the corresponding hierarchical models of 2 4 contingency tables (p q = 4) Design : p = 6, E = ABC, F = ABD ν Representative Num. of the Corresponding hierarchical null model null models model of 2 4 table 11 AB/AC/AD/BC/BD/E/F 48 ABC/ABD 12 AB/AC/AD/BC/BD/CD/E/F 96 ABC/ABD/CD Design : p = 7, E = ABC, F = ABD, G = ACD ν Representative Num. of the Corresponding hierarchical null model null models model of 2 4 table 11 AB/AC/AD/BC/BD/CD/E/F/G 3 6 = 729 ABC/ABD/ACD Design : p = 8, E = ABC, F = ABD, G = ACD, H = BCD ν Representative Num. of the Corresponding hierarchical null model null models model of 2 4 table 11 AB/AC/AD/BC/BD/CD/E/F/G 4 6 = 4096 ABC/ABD/ACD/BCD by the primitive moves. Among the results of Table 5,6 and 8, there are two models which corresponds to decomposable graphical models in the contingency tables. We can confirm that minimal Markov bases for these models consist of primitive moves as follows. We use the following notational convention for a move z. Consider 2 3 case z = (z ijk ). If z i1 j 1 k 1 = z i2 j 2 k 2 = +1, z i3 j 3 k 3 = z i4 j 4 k 4 = +1, and other elements are zeros, then we denote z as (i 1 j 1 k 1 )(i 2 j 2 k 2 ) (i 3 j 3 k 3 )(i 4 j 4 k 4 ). Similar notation is used for 2 4 case fractional factorial design of D = AB, E = AC: The main effects model A/B/C/D/E corresponds to the decomposable graphical model AB/AC of the 2 3 contingency tables. This is a conditional independence model between B and C given A and a minimal Markov basis is constructed by primitive moves as (111)(122) (112)(121), (211)(222) (212)(221) fractional factorial design of E = ABC, F = ABD: The model AB/AC/AD/BC/BD/E/F corresponds to the decomposable graphical model ABC/ABD of the 2 4 contingency tables. This is a conditional independence model between C and D given {A, B} and a minimal Markov basis is again constructed by primitive moves as (1111)(1122) (1112)(1121), (1211)(1222) (1212)(1221), (2111)(2122) (2112)(2121), (2211)(2222) (2212)(2221). 20

23 For the other designs of Table 7 (p = 5, 9, 10), all the models include the sufficient statistic y y y y y y y y 2222, y y y y y y y y 2221, and therefore have no corresponding hierarchical models in the 2 4 contingency tables. For example, the sufficient statistic of the the main effect models for design of E = ABCD is {y i }, {y j }, {y k }, {y l }, i, j, k, l = 1, 2, y y y y y y y y 2222, y y y y y y y y 2221, and the sufficient statistic of the main effect models for design of E = ABC, F = ABD, G = ACD, H = BCD, J = ABCD, K = CD is {y ijk }, {y ij l }, {y i kl }, {y jkl }, i, j, k, l = 1, 2, y y y y y y y y 2222, y y y y y y y y Discussion In this paper, we consider Markov chain Monte Carlo tests for the factor effects in the designed experiments. As is noted in Section 1, Markov chain Monte Carlo procedure is a valuable tool when the adequacy of traditional large-sample tests is doubtful and the enumeration of the conditional sample space is infeasible. Since a closed form expression of the null distribution for the conditional tests considered in Section 2 is not available in general, Markov chain Monte Carlo procedure is valuable in the settings of this paper. Computational experience given in Section 3.1 shows efficacy of our method. To perform Markov chain Monte Carlo tests, it is often problematic to calculate a Markov basis. Current algorithms may take a very long time to compute Markov basis for 64 run or larger experiments. For the designs of 16 or 32 runs (for the Poisson models), a software such as 4ti2 works quite well and very practical. Nevertheless, the arguments and theoretical considerations in Section 3.2 seem important. One of the merits to specify the corresponding models of 2 p contingency tables is a possibility to make use of general results for the Markov bases of contingency tables as shown in Section 3.2. It is also important to consider more complicated designs and give appropriate Markov bases for them, such as designs with three levels or balanced incomplete block designs. The designed experiment is one of the areas in statistics where the applications of the theory of Gröbner basis are first considered. See the works [19], [22] and [20]. In these works, the design is represented as the variety for the set of polynomial equations. On the other hand, this manuscript gives another application of Gröbner basis theory to the designed experiments by considering Markov chain Monte Carlo tests for a discrete response variable. 21

24 References [1] 4ti2 team. 4ti2 a software package for algebraic, geometric and combinatorial problems on linear spaces. Available at [2] Alan Agresti. A survey of exact inference for contingency tables. Statist. Sci., 7(1): , With comments and a rejoinder by the author. [3] Satoshi Aoki and Akimichi Takemura. The list of indispensable moves of the unique minimal markov basis for 3 4 k and contingency tables with fixed two-dimensional marginals. METR , Submitted for publication. [4] Satoshi Aoki and Akimichi Takemura. Minimal basis for a connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Aust. N. Z. J. Stat., 45(2): , [5] Satoshi Aoki and Akimichi Takemura. Markov chain Monte Carlo exact tests for incomplete two-way contingency tables. J. Stat. Comput. Simul., 75(10): , [6] Satoshi Aoki and Akimichi Takemura. Minimal invariant markov basis for sampling contingency tables with fixed marginals. Annals of the Institute of Statistical Mathematics, To appear. [7] L. W. Condra. Reliability Improvement with Design of Experiments. Marcel Dekker, New York, NY., [8] Persi Diaconis and Bernd Sturmfels. Algebraic algorithms for sampling from conditional distributions. Ann. Statist., 26(1): , [9] Ian H. Dinwoodie. The Diaconis-Sturmfels algorithm and rules of succession. Bernoulli, 4(3): , [10] Adrian Dobra. Markov bases for decomposable graphical models. Bernoulli, 9(6): , [11] G. H. Freeman and J. H. Halton. Note on an exact treatment of contingency, goodness of fit and other problems of significance. Biometrika, 38: , [12] Shelby J. Haberman. A warning on the use of chi-squared statistics with frequency tables with small expected cell counts. J. Amer. Statist. Assoc., 83(402): , [13] M. Hamada and J. A. Nelder. Generalized linear models for quality-improvement experiments. Journal of Quality Technology, 29: , [14] W. K. Hastings. Monte carlo sampling methods using markov chains and their applications. Biometrika, 57:97 109,

25 [15] Takayuki Hibi. Gröbner Bases. Asakura Shoten, Tokyo, (In Japanese). [16] B. Martin, D. Parker, and L. Zenick. Minimize slugging by optimizing controllable factors on topaz windshield molding. In Fifth Symposium on Taguchi Methods, pages American Supplier Institute, Inc., Dearborn, MI, [17] P. McCullagh and J. A. Nelder. Generalized Linear Models, 2nd ed. Chapman & Hall, London, [18] Cyrus R. Mehta and Nitin R. Patel. A network algorithm for performing Fisher s exact test in r c contingency tables. J. Amer. Statist. Assoc., 78(382): , [19] Giovanni Pistone and Henry P. Wynn. Generalised confounding with Gröbner bases. Biometrika, 83(3): , [20] Giovanni Pistonea, Eva Riccomagno, and Henry P. Wynn. Algebraic Statistics, Computational Commutative Algebra in Statistics. Chapman & Hall, [21] Brian D. Ripley. Stochastic Simulation. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Inc., New York, [22] Lorenzo Robbiano and Maria Piera Rogantin. Full factorial designs and distracted fractions. In Gröbner bases and applications (Linz, 1998), volume 251 of London Math. Soc. Lecture Note Ser., pages Cambridge Univ. Press, Cambridge, [23] Akimichi Takemura and Satoshi Aoki. Some characterizations of minimal Markov basis for sampling from discrete conditional distributions. Ann. Inst. Statist. Math., 56(1):1 17, [24] Akimichi Takemura and Satoshi Aoki. Distance reducing markov bases for sampling from a discrete sample space. Bernoulli, 11(5): , [25] C. F. Jeff Wu and Michael Hamada. Experiments: Planning, analysis, and parameter design optimization. Wiley Series in Probability and Statistics: Texts and References Section. John Wiley & Sons Inc., New York, A Wiley-Interscience Publication. 23

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Markov Bases for Two-way Subtable Sum Problems

MATHEMATICAL ENGINEERING TECHNICAL REPORTS Markov Bases for Two-way Subtable Sum Problems Hisayuki HARA, Akimichi TAKEMURA and Ruriko YOSHIDA METR 27 49 August 27 DEPARTMENT OF MATHEMATICAL INFORMATICS