Samurai Sudoku-Based Space-Filling Designs

Xu Xu and Peter Z. G. Qian
Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706

Abstract

Samurai Sudoku is a popular variation of Sudoku. The game board consists of five overlapping Sudoku grids; in each grid several entries are provided, and the remaining entries must be filled in so that no row, column, or three-by-three subsquare contains duplicate numbers. By exploiting these uniformity properties, we construct a new type of design, called a Samurai Sudoku-based space-filling design. Such a design is an orthogonal array-based Latin hypercube design with several attractive properties: (1) the complete design achieves attractive uniformity in both univariate and bivariate margins; (2) it can be divided into groups of subdesigns with overlaps such that each subdesign achieves maximum uniformity in both univariate and bivariate margins; (3) each of the overlaps achieves maximum uniformity in both univariate and bivariate margins. Examples are given to illustrate the proposed designs.

KEY WORDS: Computer experiments; Design of experiments; Orthogonal array-based Latin hypercube designs; Data pooling.

1 Introduction

The game board of the now-popular game Sudoku is a nine-by-nine grid to be filled with numbers from one to nine. Several entries within the grid are provided, and the remaining entries must be filled in so that each row, column, and three-by-three subsquare contains no duplicated numbers. Xu et al. (2011) proposed a class of Sudoku-based space-filling designs from Sudoku puzzles. As Sudoku has gained popularity, variations of it have become available. Samurai Sudoku (Puzzles, 2006) is one of the most popular variants. Figure 1 presents a completed Samurai Sudoku grid. It has five $9 \times 9$ grids that overlap at the corner regions in the shape of a quincunx.

Figure 1: A completed Samurai Sudoku grid.

By exploiting the three types of uniformity of Sudoku Latin squares and the overlapping property of a Samurai Sudoku grid, we propose an approach to constructing a new type of design, called a Samurai Sudoku-based space-filling design. This approach consists of two steps. The first step generates a set of doubly orthogonal Samurai Sudoku grids. The second step uses this set of grids to generate a Samurai Sudoku-based space-filling design. Such a design has several attractive properties: (1) the complete design achieves attractive uniformity in both univariate and bivariate margins; (2) it can be divided into groups of subdesigns with overlaps such that each subdesign achieves maximum uniformity in both univariate and bivariate margins; (3) each of the overlaps achieves maximum uniformity in both univariate and bivariate margins. A potential application of the proposed design is pooling data from multiple sources, where each subdesign is used for one source and the overlaps are used to quantify the differences among the sources.

2 Notation and Definitions

A Latin square of order $s$ is an $s \times s$ square with entries from a set of $s$ symbols such that every symbol appears precisely once in each row and each column. As in Pedersen and Vis (2009), for $s_1 = s_2^2$, define a Sudoku Latin square $L$ of order $s_1$ to be a special Latin square consisting of $s_2 \times s_2$ subsquares in which each symbol appears precisely once in each subsquare. Define a Samurai Sudoku grid $S$ of order $s_1$ to be a special grid consisting of Sudoku Latin squares of order $s_1$ with overlaps. For example, Figure 1 presents a Samurai Sudoku grid of order nine, which consists of one Sudoku Latin square of order nine overlapping with four Sudoku Latin squares of the same order at $3 \times 3$ subsquares.

For every prime $p$ and every integer $u \geq 1$, there exists a Galois field $GF(p^u)$ of order $p^u$. As in Xu et al. (2011), let $s_1 = p^{2u}$ and $s_2 = p^u$ be powers of the same prime $p$ for an integer $u \geq 1$, with $s_1$ corresponding to the number of rows of a Sudoku Latin square of order $s_1$ and $s_2$ corresponding to the number of rows of its subsquares. Let $F = GF(s_1)$ with a primitive polynomial $p_1(x)$, where $\alpha$ is a primitive element. Qian and Wu (2009) proposed the following subfield projection $\phi$. Letting $\beta = \alpha^{(s_1-1)/(s_2-1)}$, $G = \{0, \beta, \beta^2, \ldots, \beta^{s_2-1}\}$ is the subfield of $F$, where $\beta$ is a primitive element of $G$ and $\beta^{s_2-1} = 1$. Let $\beta_0 = 0$ and $\beta_i = \beta^i$, for $i = 1, \ldots, s_2-1$. For $u = 1$, $G$ becomes the residue field $\{0, 1, \ldots, p-1\}$ mod $p$.
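To make the subfield construction concrete, the following minimal sketch computes $\beta$ and $G$ for $F = GF(16)$ and $G = GF(4)$ (so $p = 2$, $u = 2$). The choice of primitive polynomial $x^4 + x + 1$, the encoding of field elements as 4-bit integers, and the helper names `gf_mul` and `gf_pow` are assumptions made for illustration; they do not come from the paper.

```python
# Elements of GF(16) are encoded as 4-bit integers whose bits are the
# coefficients of 1, x, x^2, x^3; addition is bitwise XOR.
PRIMITIVE_POLY = 0b10011  # x^4 + x + 1 (an assumed choice of p_1(x))
NBITS = 4


def gf_mul(a, b):
    """Multiply two elements of GF(16) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> NBITS:          # a degree-4 term appeared: reduce mod p_1(x)
            a ^= PRIMITIVE_POLY
    return result


def gf_pow(a, e):
    """Raise a to the non-negative integer power e by repeated multiplication."""
    out = 1
    for _ in range(e):
        out = gf_mul(out, a)
    return out


s1, s2 = 16, 4
alpha = 0b0010                                      # the element x
beta = gf_pow(alpha, (s1 - 1) // (s2 - 1))          # beta = alpha^5
G = [0] + [gf_pow(beta, i) for i in range(1, s2)]   # {0, beta, beta^2, beta^3}

# A subfield must be closed under addition (XOR) and multiplication.
closed = all((a ^ b) in G and gf_mul(a, b) in G for a in G for b in G)
print(beta, G, closed)
```

Under this encoding, $\beta = \alpha^5$ corresponds to $\alpha^2 + \alpha$, and the last power $\beta^{s_2-1}$ equals 1, in line with the definition of $G$ above.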
As $F$ is a vector space over $G$ with respect to the polynomial basis $\{1, \alpha\}$ (Lidl and Niederreiter, 1997, Chapter 3), represent any $a \in F$ as

$$a = b_0 + b_1\alpha \quad (b_0, b_1 \in G) \tag{1}$$

and define

$$\phi(a) = b_0 + b_1\beta \quad (b_0, b_1 \in G). \tag{2}$$

Lemma 1 captures some important properties of $\phi$.

Lemma 1 (Qian and Wu 2009). For the subfield projection $\phi$, we have that (i) $\phi(b) = b$, for any $b \in G$; (ii) $\phi(a_1 + a_2) = \phi(a_1) + \phi(a_2)$, for any $a_1, a_2 \in F$; (iii) $\phi(ba) = b\phi(a)$, for any $b \in G$ and $a \in F$.

For $a \in F$, Xu et al. (2011) defined a projection $\delta: F \to G$ as

$$\delta(a) = \begin{cases} \phi(a^{-1}) & (a \neq 0), \\ 0 & (a = 0). \end{cases} \tag{3}$$

For a matrix $A$, $A^T$ denotes its transpose, $A(:, j)$ denotes the $j$th column, $A(i, :)$ denotes the $i$th row, $A(i, j)$ denotes the $(i, j)$th entry, and $a + A$ denotes the elementwise sum of $A$ and a scalar $a$. For a matrix $D$ with entries in $F$, $\delta(D)$ denotes the matrix obtained from $D$ after the levels of its entries are collapsed according to $\delta$. For any set $B \subseteq F$, let $\delta(B) = \{\delta(a) : a \in B\}$. For any $b \in G$, define its preimage to be $\delta^{-1}(b) = \{a \in F : \delta(a) = b\}$. For $D \subseteq G$, let $\delta^{-1}(D) = \{a \in F : \delta(a) \in D\}$. Define the kernel matrix $\Gamma$ of $\delta$ to be an $s_2 \times s_2$ matrix with $i$th row

$$\Gamma(i, :) = \delta^{-1}(\beta_{i-1}), \tag{4}$$

where the elements in each row are arranged in lexicographical order. This matrix has the following properties: (a) each element of $F$ appears precisely once in $\Gamma$; (b) $\delta$ collapses the entries in the same row of $\Gamma$ into a common element in $G$, i.e., for $i = 1, \ldots, s_2$, $\delta\{\Gamma(i, j)\} = \beta_{i-1}$ $(j = 1, \ldots, s_2)$; and (c) for $j = 1, \ldots, s_2$, $\delta\{\Gamma(:, j)\} = G$.

Example 1. Let $p = 2$ and $u = 1$. Use $p_1(x) = x^2 + x + 1$ for $F = GF(4)$ with $\alpha$ as a primitive element. Take $G$ to be the subfield $\{0, 1\}$ of $F$ with $\beta = 1$. Here $\phi$ is:

$\{0, \alpha + 1\} \to 0$, $\{1, \alpha\} \to 1$, and $\delta$ is: $\{0, \alpha\} \to 0$, $\{1, \alpha + 1\} \to 1$, where the kernel matrix $\Gamma$ of $\delta$ is

$$\Gamma = \begin{pmatrix} 0 & \alpha \\ 1 & \alpha + 1 \end{pmatrix}.$$

3 Construction of Doubly Orthogonal Samurai Sudoku Grids

We propose a method for constructing doubly orthogonal Samurai Sudoku grids, which serve as the basis for generating Samurai Sudoku-based space-filling designs. The key here is the construction of a Samurai Sudoku grid from a Sudoku Latin square. Let $s_1$, $s_2$, $F$ and $G$ be as defined in Section 2. A set of Sudoku Latin squares of order $s_1$ is doubly orthogonal if the whole squares are mutually orthogonal and, for any given location, all $s_2 \times s_2$ subsquares are mutually orthogonal after level-collapsing according to the projection $\phi$ in (2).

Definition 1. For $s_1 = s_2^2$, a set of Samurai Sudoku grids of order $s_1$ is doubly orthogonal if the center Sudoku Latin squares are doubly orthogonal and all other Sudoku Latin squares in the same location are orthogonal.

For $a \in F$, $a + G = \{a + b : b \in G\}$ is an additive coset of $G$ for which $a$ is a representative. Since $G$ is a subfield of $F$, $F$ can be partitioned into $s_2$ additive cosets of $G$, $P_1, \ldots, P_{s_2}$, where $c_i$ is a representative of $P_i$ and $c_1$ is the zero element of $G$ (Lidl and Niederreiter, 1997, Chapter 1). Denote by $A_0$ the $s_2 \times s_2$ addition table of $G$. As in Pedersen and Vis (2009), let $A_{ij} = c_i + c_j + A_0$, for $i, j = 1, \ldots, s_2$. Partition the $s_1 \times s_1$ addition table $A$ of $F$ as

$$A = \begin{pmatrix} A_{11} & \cdots & A_{1 s_2} \\ \vdots & \ddots & \vdots \\ A_{s_2 1} & \cdots & A_{s_2 s_2} \end{pmatrix}. \tag{5}$$

We first introduce the construction of doubly orthogonal Sudoku Latin squares proposed in Xu et al. (2011). For $a \in F$ and the matrix $A$ in (5), denote by $r_a$ its row whose first entry is $a$. Denote by $\xi_i$ the $i$th entry of the first column of $A$. Using the kernel matrix $\Gamma$ in (4), let

$$\gamma_l \text{ be an arbitrary element of } \Gamma(l, :) \setminus G \quad (l = 1, \ldots, s_2). \tag{6}$$

Obtain a Sudoku Latin square $L(\gamma_l)$ with $i$th row

$$r_{\xi_i \gamma_l^{-1}}. \tag{7}$$

Let $L^{(ij)}(\gamma_l)$ denote the $(i, j)$th subsquare of $L(\gamma_l)$. By properties (b) and (c) of $\Gamma$ discussed in Section 2, $\delta(\gamma_1), \ldots, \delta(\gamma_{s_2})$ constitute the subfield $G$. Put

$$L = \{L(\gamma_1), \ldots, L(\gamma_{s_2})\}, \tag{8}$$

where $L(\gamma_i)$ is constructed by (7), for $i = 1, \ldots, s_2$. Theorem 1 in Xu et al. (2011) guarantees that $L$ in (8) is a set of doubly orthogonal Sudoku Latin squares.

Example 2. Let $p = 2$ and $u = 1$. Use $p_1(x) = x^2 + x + 1$ for $F = GF(4)$ with $\alpha$ as a primitive element. Take $G$ to be the subfield $\{0, 1\}$ of $F$ with $\beta = 1$. The projection $\phi$ from $F$ to $G$ was given in Example 1. The addition table of $F$ is

$$A = \begin{pmatrix} 0 & 1 & \alpha & \alpha+1 \\ 1 & 0 & \alpha+1 & \alpha \\ \alpha & \alpha+1 & 0 & 1 \\ \alpha+1 & \alpha & 1 & 0 \end{pmatrix},$$

where the first row is denoted as $r_0$, the third row as $r_\alpha$, and so on. With $\gamma_1 = \alpha$ and $\gamma_2 = \alpha + 1$, two doubly orthogonal Sudoku Latin squares can be obtained:

$$L(\gamma_1) = \begin{pmatrix} 0 & 1 & \alpha & \alpha+1 \\ \alpha+1 & \alpha & 1 & 0 \\ 1 & 0 & \alpha+1 & \alpha \\ \alpha & \alpha+1 & 0 & 1 \end{pmatrix}, \quad L(\gamma_2) = \begin{pmatrix} 0 & 1 & \alpha & \alpha+1 \\ \alpha & \alpha+1 & 0 & 1 \\ \alpha+1 & \alpha & 1 & 0 \\ 1 & 0 & \alpha+1 & \alpha \end{pmatrix},$$

where $\phi\{L^{(ij)}(\gamma_1)\}$ and $\phi\{L^{(ij)}(\gamma_2)\}$ are mutually orthogonal, for $i, j = 1, 2$.
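The construction in (7) and the double orthogonality claimed in Example 2 can be checked numerically. The sketch below is only an illustration for $GF(4)$: the integer encoding of field elements (0, 1, 2, 3 for $0, 1, \alpha, \alpha+1$), the lookup tables, and the function names are assumptions introduced here, not part of the original construction.

```python
from collections import Counter

# GF(4) encoded as integers: 0 -> 0, 1 -> 1, 2 -> alpha, 3 -> alpha + 1.
# Addition is bitwise XOR; multiplication uses alpha^2 = alpha + 1.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]
INV = {1: 1, 2: 3, 3: 2}            # multiplicative inverses
PHI = {0: 0, 1: 1, 2: 1, 3: 0}      # subfield projection phi from Example 1
S1, S2 = 4, 2


def sudoku_square(gamma):
    """L(gamma): the i-th row is row r_{xi_i * gamma^{-1}} of the addition table."""
    g_inv = INV[gamma]
    # xi runs over the first column of the addition table: 0, 1, alpha, alpha+1.
    return [[MUL[xi][g_inv] ^ col for col in range(S1)] for xi in range(S1)]


def subsquare(square, i, j):
    """The (i, j)-th S2 x S2 subsquare (0-based), stacked column by column."""
    return [square[i * S2 + r][j * S2 + c] for c in range(S2) for r in range(S2)]


def is_sudoku(square):
    """Every row, column and subsquare contains each symbol exactly once."""
    full = set(range(S1))
    rows_ok = all(set(row) == full for row in square)
    cols_ok = all(set(col) == full for col in zip(*square))
    subs_ok = all(set(subsquare(square, i, j)) == full
                  for i in range(S2) for j in range(S2))
    return rows_ok and cols_ok and subs_ok


def orthogonal(xs, ys, n_levels):
    """All n_levels^2 ordered pairs occur, each equally often."""
    counts = Counter(zip(xs, ys))
    return len(counts) == n_levels ** 2 and len(set(counts.values())) == 1


def flatten(square):
    return [v for row in square for v in row]


L1 = sudoku_square(2)   # gamma_1 = alpha
L2 = sudoku_square(3)   # gamma_2 = alpha + 1

whole_ok = orthogonal(flatten(L1), flatten(L2), S1)
subs_ok = all(
    orthogonal([PHI[v] for v in subsquare(L1, i, j)],
               [PHI[v] for v in subsquare(L2, i, j)], S2)
    for i in range(S2) for j in range(S2))
print(is_sudoku(L1), is_sudoku(L2), whole_ok, subs_ok)
```

The two checks correspond to the two halves of the definition: `whole_ok` tests orthogonality of the full squares, and `subs_ok` tests orthogonality of the subsquares at each location after collapsing levels with $\phi$.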

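Now we generate a Samurai Sudoku grid $S$ from a Sudoku Latin square $L$. The key here is to obtain Sudoku Latin squares $L_{ij}$ by permuting the levels of $L$, such that the $(1,1)$th subsquare of $L_{ij}$ is identical to the $(i, j)$th subsquare of $L$. Let $c_i$ be the representative of the additive coset $P_i$ of $G$ defined above. For $\gamma \in \{\gamma_1, \ldots, \gamma_{s_2}\}$, define

$$L_{ij}(\gamma) = c_i \gamma^{-1} + c_j + L(\gamma). \tag{9}$$

Put

$$S(\gamma) = \{L(\gamma), L_{ij}(\gamma), \ i, j = 1, \ldots, s_2\}. \tag{10}$$

Theorem 1 shows that $S(\gamma)$ forms a Samurai Sudoku grid. From a set of doubly orthogonal Sudoku Latin squares $\{L(\gamma_1), \ldots, L(\gamma_{s_2})\}$ in (8), obtain a set of Samurai Sudoku grids $\{S(\gamma_1), \ldots, S(\gamma_{s_2})\}$, where $S(\gamma_l)$ is defined in (10). Theorem 1 also shows that $\{S(\gamma_1), \ldots, S(\gamma_{s_2})\}$ is a set of doubly orthogonal Samurai Sudoku grids.

Theorem 1. For $S(\gamma_l)$ defined above, (i) $S(\gamma_l)$ forms a Samurai Sudoku grid of order $s_1$; (ii) $\{S(\gamma_1), \ldots, S(\gamma_{s_2})\}$ is a set of doubly orthogonal Samurai Sudoku grids.

Proof. To prove (i), it suffices to show that for $i, j = 1, \ldots, s_2$, one subsquare of $L_{ij}(\gamma_l)$ is identical to a subsquare of $L(\gamma_l)$. The construction of the Sudoku Latin square in (7) implies that for $i, j, l = 1, \ldots, s_2$,

$$L^{(ij)}(\gamma_l) = c_i \gamma_l^{-1} + c_j + L^{(11)}(\gamma_l), \tag{11}$$

where the right-hand side of (11) is the $(1,1)$th subsquare of $L_{ij}(\gamma_l)$ according to (9). This proves (i). Part (ii) follows from the double orthogonality of $L(\gamma_1), \ldots, L(\gamma_{s_2})$.

Example 3. Let $p = 2$ and $u = 1$. Use $p_1(x) = x^2 + x + 1$ for $F = GF(4)$ with $\alpha$ as a primitive element. Take $G$ to be the subfield $\{0, 1\}$ of $F$ with $\beta = 1$. The projection $\phi$ from $F$ to $G$ was given in Example 1. For $L(\gamma_1)$ and $L(\gamma_2)$ given in Example 2, Theorem 1 gives two doubly orthogonal Samurai Sudoku grids $S_1 = \{L(\gamma_1), L_{ij}(\gamma_1), \ i, j = 1, 2\}$ and $S_2 = \{L(\gamma_2), L_{ij}(\gamma_2), \ i, j = 1, 2\}$, where, with coset representatives $c_1 = 0$ and $c_2 = \alpha$,

$$L_{11}(\gamma_1) = \begin{pmatrix} 0 & 1 & \alpha & \alpha+1 \\ \alpha+1 & \alpha & 1 & 0 \\ 1 & 0 & \alpha+1 & \alpha \\ \alpha & \alpha+1 & 0 & 1 \end{pmatrix}, \quad L_{12}(\gamma_1) = \begin{pmatrix} \alpha & \alpha+1 & 0 & 1 \\ 1 & 0 & \alpha+1 & \alpha \\ \alpha+1 & \alpha & 1 & 0 \\ 0 & 1 & \alpha & \alpha+1 \end{pmatrix},$$

$$L_{21}(\gamma_1) = \begin{pmatrix} 1 & 0 & \alpha+1 & \alpha \\ \alpha & \alpha+1 & 0 & 1 \\ 0 & 1 & \alpha & \alpha+1 \\ \alpha+1 & \alpha & 1 & 0 \end{pmatrix}, \quad L_{22}(\gamma_1) = \begin{pmatrix} \alpha+1 & \alpha & 1 & 0 \\ 0 & 1 & \alpha & \alpha+1 \\ \alpha & \alpha+1 & 0 & 1 \\ 1 & 0 & \alpha+1 & \alpha \end{pmatrix},$$

$$L_{11}(\gamma_2) = \begin{pmatrix} 0 & 1 & \alpha & \alpha+1 \\ \alpha & \alpha+1 & 0 & 1 \\ \alpha+1 & \alpha & 1 & 0 \\ 1 & 0 & \alpha+1 & \alpha \end{pmatrix}, \quad L_{12}(\gamma_2) = \begin{pmatrix} \alpha & \alpha+1 & 0 & 1 \\ 0 & 1 & \alpha & \alpha+1 \\ 1 & 0 & \alpha+1 & \alpha \\ \alpha+1 & \alpha & 1 & 0 \end{pmatrix},$$

$$L_{21}(\gamma_2) = \begin{pmatrix} \alpha+1 & \alpha & 1 & 0 \\ 1 & 0 & \alpha+1 & \alpha \\ 0 & 1 & \alpha & \alpha+1 \\ \alpha & \alpha+1 & 0 & 1 \end{pmatrix}, \quad L_{22}(\gamma_2) = \begin{pmatrix} 1 & 0 & \alpha+1 & \alpha \\ \alpha+1 & \alpha & 1 & 0 \\ \alpha & \alpha+1 & 0 & 1 \\ 0 & 1 & \alpha & \alpha+1 \end{pmatrix}.$$

4 Generation of Samurai Sudoku-Based Space-Filling Designs

We present a two-step procedure for using the set of doubly orthogonal Samurai Sudoku grids $\{S(\gamma_1), \ldots, S(\gamma_{s_2})\}$ in Theorem 1 to generate a Samurai Sudoku-based space-filling design. Recall that $s_1 = s_2^2$ in Section 2. The first step uses $\{S(\gamma_1), \ldots, S(\gamma_{s_2})\}$ to obtain an orthogonal array $B$, which can be divided into submatrices with overlaps, each of which is an orthogonal array after level-collapsing according to $\phi$ in (2). An orthogonal array $OA(n, d, s, t)$ is an $n \times d$ matrix with entries from a set of $s$ symbols such that for every $n \times t$ submatrix, all level combinations occur equally often (Wu and Hamada, 2009). The second step uses $B$ to generate a Samurai Sudoku-based space-filling design via an elaborate level-relabeling scheme. The key here is to relabel the entries in the submatrix $B_0$ of $B$ first and then the others. The first step constructs $B$ as follows: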

(i) For $l = 1, \ldots, s_2$, convert $L^{(ij)}(\gamma_l)$ to a column vector $\zeta_l$ of length $s_1$ by stacking the $s_2$ columns of the subsquare, for $i, j = 1, \ldots, s_2$.
(ii) Combine $\zeta_1, \ldots, \zeta_{s_2}$ column by column to form an $s_1 \times s_2$ array $B_{0ij}$.
(iii) Combine all $B_{0ij}$ together row by row to form an $s_1^2 \times s_2$ array $B_0$.
(iv) For $i, j = 1, \ldots, s_2$, repeat steps (i), (ii) and (iii) for $L_{ij}(\gamma_l)$ to obtain an $(s_1^2 - s_1) \times s_2$ array $B_{ij}$ after removing the rows corresponding to the $(1,1)$th subsquare of $L_{ij}(\gamma_l)$.
(v) Combine $B_0$ and all $B_{ij}$ together row by row to form an $s_1^3 \times s_2$ array $B$.

The double orthogonality of these Samurai Sudoku grids guarantees that $B$ is an orthogonal array with special slicing structures, as given in Theorem 2.

Theorem 2. For $B$, $B_0$, the $B_{ij}$ and the $B_{0ij}$ constructed above, we have that
(i) the matrix $B$ is an $OA(s_1^3, s_2, s_1, 2)$;
(ii) the submatrices $B_{0ij}$ form a partition of $B_0$ and each $\phi\{B_{0ij}\}$ is an $OA(s_1, s_2, s_2, 2)$;
(iii) each $B_{0ij}$ is a Latin hypercube of $s_1$ levels;
(iv) both the matrix $B_0$ and the combined matrix of $B_{ij}$ and $B_{0ij}$ are $OA(s_1^2, s_2, s_1, 2)$.

Example 4. Consider the doubly orthogonal Samurai Sudoku grids $S_1$ and $S_2$ in Example 3. For $l = 1, 2$, convert $L^{(ij)}(\gamma_l)$ to a column vector $\zeta_l$ of length 4 by stacking the 2 columns of the subsquare, for $i, j = 1, 2$; combine $\zeta_1, \zeta_2$ column by column to form a $4 \times 2$ array $B_{0ij}$; and combine all $B_{0ij}$ together row by row to form a $16 \times 2$ array $B_0$. For $i, j = 1, 2$, repeat the above steps for $L_{ij}(\gamma_l)$ to obtain a $12 \times 2$ array $B_{ij}$ after removing the rows corresponding to the $(1,1)$th subsquare of $L_{ij}(\gamma_l)$. Finally, combine $B_0$ and all $B_{ij}$ together row by row to form the $64 \times 2$ array $B$ given in Table 1.

Table 1: The orthogonal array B in Example 4

Run #   1      2      Source   |   Run #   1      2      Source
1       0      0      B_011    |   33      α+1    1      B_21
2       α+1    α      B_011    |   34      0      α+1    B_21
3       1      1      B_011    |   35      α      0      B_21
4       α      α+1    B_011    |   36      1      α      B_21
5       1      α+1    B_021    |   37      α      α      B_21
6       α      1      B_021    |   38      1      0      B_21
7       0      α      B_021    |   39      α+1    α+1    B_21
8       α+1    0      B_021    |   40      0      1      B_21
9       α      α      B_012    |   41      α+1    1      B_12
10      1      0      B_012    |   42      0      α+1    B_12
11      α+1    α+1    B_012    |   43      α      0      B_12
12      0      1      B_012    |   44      1      α      B_12
13      α+1    1      B_022    |   45      0      0      B_12
14      0      α+1    B_022    |   46      α+1    α      B_12
15      α      0      B_022    |   47      1      1      B_12
16      1      α      B_022    |   48      α      α+1    B_12
17      1      α+1    B_11     |   49      1      α+1    B_12
18      α      1      B_11     |   50      α      1      B_12
19      0      α      B_11     |   51      0      α      B_12
20      α+1    0      B_11     |   52      α+1    0      B_12
21      α      α      B_11     |   53      α      α      B_22
22      1      0      B_11     |   54      1      0      B_22
23      α+1    α+1    B_11     |   55      α+1    α+1    B_22
24      0      1      B_11     |   56      0      1      B_22
25      α+1    1      B_11     |   57      1      α+1    B_22
26      0      α+1    B_11     |   58      α      1      B_22
27      α      0      B_11     |   59      0      α      B_22
28      1      α      B_11     |   60      α+1    0      B_22
29      0      0      B_21     |   61      0      0      B_22
30      α+1    α      B_21     |   62      α+1    α      B_22
31      1      1      B_21     |   63      1      1      B_22
32      α      α+1    B_21     |   64      α      α+1    B_22

The second step permutes the levels of $B$ to produce a Samurai Sudoku-based space-filling design $D$. First, relabel the $s_1$ levels of $B$ as $1, \ldots, s_1$, such that the group of levels of $GF(s_1)$ that are mapped to the same level according to $\phi$ form a consecutive subset of $\{1, \ldots, s_1\}$. Label the $s_2$ groups as groups $1, \ldots, s_2$ and the $s_2$ levels within the $i$th group as $(i-1)s_2 + 1, \ldots, (i-1)s_2 + s_2$, for $i = 1, \ldots, s_2$. As in Tang (1993), in each column of $B_0$, replace the $s_1$ positions with entry $k$ by a random permutation of $(k-1)s_1 + 1, \ldots, (k-1)s_1 + s_1$, for $k = 1, \ldots, s_1$, to generate a Latin hypercube $A_0$; let $A_{0ij}$ denote the rows of $A_0$ corresponding to $B_{0ij}$. In each column of $B_{ij}$, generate an array $A_{ij}$ by replacing the $s_1 - 1$ positions with entry $k$ by a random sample of size $s_1 - 1$ from $(k-1)s_1 + 1, \ldots, (k-1)s_1 + s_1$, such that it together with the entries of the corresponding column of $A_{0ij}$ forms a permutation of $(k-1)s_1 + 1, \ldots, (k-1)s_1 + s_1$, for $k = 1, \ldots, s_1$. Obtain an array $A$ by combining $A_0$ and all $A_{ij}$ row by row. In each column of $A$, replace the $s_1$ positions with entry $k$ by a random permutation of $(k-1)s_1 + 1, \ldots, (k-1)s_1 + s_1$, for $k = 1, \ldots, s_1^2$, to generate a Latin hypercube $C = (c_{ij})$. Finally, $D = (d_{ij})$ is generated through

$$d_{ij} = \frac{c_{ij} - u_{ij}}{s_1^3}, \tag{12}$$

where $i = 1, \ldots, s_1^3$, $j = 1, \ldots, s_2$, and the $u_{ij}$ are independent $U(0, 1]$ random variables. Let $D_0$ be the subdesign of $D$ corresponding to $B_0$, and $D_{ij}$ the subdesign corresponding to the combined matrix of $B_{0ij}$ and $B_{ij}$. Let $D_{0ij}$ be the overlap between $D_0$ and $D_{ij}$, corresponding to $B_{0ij}$, for $i, j = 1, \ldots, s_2$. As described in Proposition 1, $D$ is a Samurai Sudoku-based space-filling design in which (1) the whole design achieves both one- and two-dimensional stratification; (2) the subdesigns $D_0$ and $D_{ij}$ achieve maximum uniformity in any one- or two-dimensional projection; (3) each overlap $D_{0ij}$ between $D_0$ and $D_{ij}$ achieves maximum uniformity in any one- or two-dimensional projection.

Proposition 1. Consider the $D$, $D_0$, $D_{ij}$ and $D_{0ij}$ constructed above. Then we have
(i) the design $D$ achieves stratification in both one and two dimensions;
(ii) the subdesigns $D_0$ and $D_{ij}$ achieve maximum uniformity on $s_1 \times s_1$ grids in two dimensions, in addition to achieving maximum uniformity in one dimension;
(iii) the subdesign $D_0$ can be divided into slices $D_{0ij}$, each of which is an overlap between $D_0$ and $D_{ij}$ and achieves maximum uniformity on $s_2 \times s_2$ grids in two dimensions, in addition to achieving maximum uniformity in one dimension.

Remark 1. The design $D$ may be used for pooling data from $s_1 + 1$ sources. The subdesign $D_0$ can be used for the benchmark source and the $D_{ij}$ for the others. The data from the overlap $D_{0ij}$ may be used for quantifying the differences between the benchmark source with $D_0$ and the source with $D_{ij}$. These differences can then be used to adjust data from the $D_{ij}$, which can be combined with the data from $D_0$ for further statistical analysis.

Example 5. Consider the array $B$ in Example 4. Divide the four levels into two groups: $\{0, \alpha + 1\}$ and $\{1, \alpha\}$. Label the levels of these two groups as $\{0, \alpha + 1\} \to \{4, 3\}$ and $\{1, \alpha\} \to \{1, 2\}$. Then $B$ is used to produce a Samurai Sudoku-based space-filling design $D$ of 64 runs, which can be divided into five 16-run subdesigns $D_0$ and $D_{ij}$ with overlaps, corresponding to $B_0$ and the combined matrix of $B_{0ij}$ and $B_{ij}$, respectively. The subdesigns $D_0$ and $D_{ij}$ overlap at $D_{0ij}$, which has four runs. Figure 2 presents the bivariate projection of $D$, where in any one dimension, each of the 64 equally spaced intervals of $(0, 1]$ contains exactly one point, and in any two dimensions, each of the $4 \times 4$ square bins contains precisely four points. Figure 3 presents the bivariate projection of the subdesign $D_0$, where in any one dimension, each of the 16 equally spaced intervals of $(0, 1]$ contains exactly one point, and in any two dimensions, each of the $4 \times 4$ square bins contains precisely one point. Figure 4 presents the bivariate projection of one overlap $D_{011}$, where in any one dimension, each of the four equally spaced intervals of $(0, 1]$ contains exactly one point, and in any two dimensions, each of the $2 \times 2$ square bins contains precisely one point. The design $D$ can be used for pooling data from five sources. The subdesign $D_0$ can be used for the benchmark source and the $D_{ij}$ for the others. The data from the overlap $D_{0ij}$ can be used for quantifying the differences between the benchmark source and the other sources.

Figure 2: Bivariate projections of a Samurai Sudoku-based space-filling design $D$ of 64 runs in Example 5 (axes: $x_1$, $x_2$).

Figure 3: Bivariate projections of the subdesign $D_0$ of 16 runs of a Samurai Sudoku-based space-filling design of 64 runs in Example 5, which is divided into four slices shown with distinct plotting symbols (axes: $x_1$, $x_2$).

Figure 4: Bivariate projections of one overlap $D_{011}$ of four runs between $D_0$ and $D_{11}$ in a Samurai Sudoku-based space-filling design $D$ of 64 runs in Example 5 (axes: $x_1$, $x_2$).

5 Conclusions

We have proposed a new type of design, called a Samurai Sudoku-based space-filling design. Such a design can be divided into subdesigns with overlaps, where the whole design, the subdesigns and the overlaps all achieve attractive space-filling properties. It may be used for pooling data from multiple sources. The data from the overlaps may be used for quantifying and adjusting for the differences between sources. Other potential applications of the proposed design include computer experiments with qualitative and quantitative factors and cross-validation.

Acknowledgements

The authors thank Qiyi Jiang for useful discussions.

References

Lidl, R. and Niederreiter, H. (1997), Finite Fields (Encyclopedia of Mathematics and Its Applications) (2nd ed.), Cambridge: Cambridge University Press.
Pedersen, R. M. and Vis, T. L. (2009), Sets of Mutually Orthogonal Sudoku Latin Squares, College Mathematics Journal, 40(3), 174-180.
Puzzles, C. (2006), Sudoku Variants, New York: Sterling.
Qian, P. Z. G. and Wu, C. F. J. (2009), Sliced Space-Filling Designs, Biometrika, 96, 945-956.
Tang, B. (1993), Orthogonal Array-Based Latin Hypercubes, Journal of the American Statistical Association, 88, 1392-1397.
Wu, C. F. J. and Hamada, M. (2009), Experiments: Planning, Analysis, and Optimization (2nd ed.), New York: Wiley.
Xu, X., Haaland, B., and Qian, P. Z. G. (2011), Sudoku-Based Space-Filling Designs, Biometrika, 98, 711-720.
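As a supplementary illustration of the first step in Section 4, the sketch below rebuilds the $64 \times 2$ array $B$ of Example 4 for $GF(4)$ and checks the orthogonal array properties stated in Theorem 2. The integer encoding of $GF(4)$ (0, 1, 2, 3 for $0, 1, \alpha, \alpha+1$), the coset representatives $c_1 = 0$ and $c_2 = \alpha$, the row ordering, and all function names are assumptions made for this illustration only.

```python
from collections import Counter

# GF(4) encoded as integers: 0 -> 0, 1 -> 1, 2 -> alpha, 3 -> alpha + 1.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]
INV = {1: 1, 2: 3, 3: 2}
PHI = {0: 0, 1: 1, 2: 1, 3: 0}      # subfield projection phi from Example 1
S1, S2 = 4, 2
COSET_REPS = [0, 2]                 # assumed representatives c_1 = 0, c_2 = alpha
GAMMAS = [2, 3]                     # gamma_1 = alpha, gamma_2 = alpha + 1


def sudoku_square(gamma):
    """L(gamma) from (7): i-th row is r_{xi_i * gamma^{-1}} of the addition table."""
    g_inv = INV[gamma]
    return [[MUL[xi][g_inv] ^ col for col in range(S1)] for xi in range(S1)]


def shifted_square(gamma, i, j):
    """L_ij(gamma) = c_i * gamma^{-1} + c_j + L(gamma), as in (9) (0-based i, j)."""
    shift = MUL[COSET_REPS[i]][INV[gamma]] ^ COSET_REPS[j]
    return [[shift ^ v for v in row] for row in sudoku_square(gamma)]


def stacked(square, i, j):
    """Stack the columns of the (i, j)-th subsquare into one vector (step (i))."""
    return [square[i * S2 + r][j * S2 + c] for c in range(S2) for r in range(S2)]


def block(squares, i, j):
    """Pair the stacked (i, j)-th subsquares of all squares column by column (step (ii))."""
    return list(zip(*(stacked(sq, i, j) for sq in squares)))


def is_oa2(rows, n_levels):
    """Strength-two check for a two-column array: all level pairs equally frequent."""
    counts = Counter(rows)
    return len(counts) == n_levels ** 2 and len(set(counts.values())) == 1


# Subsquare order matching Table 1: (1,1), (2,1), (1,2), (2,2).
ORDER = [(i, j) for j in range(S2) for i in range(S2)]

center = [sudoku_square(g) for g in GAMMAS]
B0_slices = {ij: block(center, *ij) for ij in ORDER}
B0 = [row for ij in ORDER for row in B0_slices[ij]]          # step (iii)

B = list(B0)
combined_ok = True
for (i, j) in ORDER:                                          # steps (iv) and (v)
    outer = [shifted_square(g, i, j) for g in GAMMAS]
    B_ij = [row for ij in ORDER[1:] for row in block(outer, *ij)]  # drop (1,1)th subsquare
    combined_ok &= is_oa2(B0_slices[(i, j)] + B_ij, S1)
    B.extend(B_ij)

slices_ok = all(is_oa2([(PHI[a], PHI[b]) for a, b in rows], S2)
                for rows in B0_slices.values())
print(len(B), is_oa2(B, S1), is_oa2(B0, S1), slices_ok, combined_ok)
```

The flags `is_oa2(B, S1)`, `slices_ok` and `combined_ok` correspond to parts (i), (ii) and (iv) of Theorem 2 for this small case; the random relabeling of the second step is not covered by this sketch.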