Ciphertext-only Cryptanalysis of a Substitution Permutation Network

No Author Given

No Institute Given

Abstract. We present the first ciphertext-only cryptanalytic attack against a substitution permutation network (SPN) block cipher. Unlike many existing attack methods, which require huge amounts of data, our method requires only two n-bit ciphertext blocks encrypted with the same key, for an n-bit cipher. The method is a divide-and-conquer strategy that exploits the partitioning of data induced by parallel S-boxes in a conventional SPN design. The partitioning reduces the problem of ciphertext-only cryptanalysis, at the round level, to that of solving sets of simultaneous boolean equations over binary vector spaces of much smaller dimensionality than the block size. These equations give rise to local solutions that are concatenated to form global solutions satisfying some logical consistency conditions. From the global solutions, cipher states and subkeys are deduced using a search of complexity not more than $O(2^{2s})$, with $s$ the number of S-boxes. Our method exploits the fact that a binary representation of an S-box specifies all the information needed regarding its encoding in terms of its component nonlinear boolean functions. The attack is effective due to its modest space and time complexities relative to some well-known cryptanalytic attacks on SPN ciphers.

Keywords: SPN, logical consistency, cipher state, completeness, ciphertext-only

1 Introduction

"... and many things which cannot be overcome when they are taken together, yield themselves up when taken little by little." Plutarch, c. AD 46-127.

The current state of the art in cryptanalysis fails to achieve the primary goal of being useful to its practitioners. At best its benefits have been minimal, and at worst it seems to have led to a fixation with ideas of dubious value. Unless cryptanalysis is practicable, we may as well do without it the things that we currently do with it.
The goal of cryptanalysis is achieved when a cipher is broken. But when an attack is only a certificational weakness of some cipher, that attack is not a proven solution. There has been an increase in the number of S-box design criteria, among which can be mentioned balancedness, nonlinearity, completeness, non-degeneracy, the strict avalanche criterion, and the higher-order strict avalanche criterion, with possibly many others still in the making. At the same time we have seen more attacks of a differential and/or linear type, such as higher-order differential, impossible differential, linear differential, and truncated differential attacks, with possibly their derivatives still in the making. Indeed, as more S-box design criteria were proposed, more derivatives of the linear or differential type seem to have appeared. Could this be an indication that there has been an overcommitment to particular ideas? While no improvement in computation time or memory is too slight to matter, commitment must surely be to the goal (of breaking a cipher) and not to an approach. Overcommitment to ideas can all too easily lead to a narrowing of the view of what is possible and to a false sense of security.

With ideas getting rediscovered and reinvented, one thing never spoken of in the cryptanalysis literature is an optimum solution. With required memories and computation times in the galactic range for many attacks based on the chosen plaintext or known plaintext cryptanalysis models, the question of whether a suggested cryptanalysis solution is optimum never seems to arise. For all we know, an attack that requires $2^{80}$ data pairs and $2^{65}$ computation steps might be the worst possible, or it could simply be an infeasible best case in some set of related approaches. When an attack model must make unrealistic assumptions, does that not imply that the resulting solution is itself unrealistic?
As more attacks have appeared about which it is difficult to reason, should we not revisit the question "What is the best way to solve this (cryptanalysis) problem?", rather than throw amounts of data we do not have at the problem? While much that is insightful about block cipher design has been learned from the attacks based on the known and chosen plaintext cryptanalysis models, the most realistic model of all has become the least important through neglect. There has been an overemphasis on known and chosen plaintext attack models and yet, in reality, the adversary is likely to have only small amounts of ciphertext in his possession when he mounts the attack; or at least he is unlikely to acquire or produce $2^{50}$ data pairs, say, and then assemble an intelligible message within the hard time constraints of businesses where the value of information depreciates with time. We are not aware of any attacks based on the chosen ciphertext model, but this model is no more practical than the other two.

In this paper we present a cryptanalysis approach that represents an optimum solution, or at least one that balances the ideals of theory with the reality of practice. Our approach makes the reasonable assumption that only two ciphertext blocks are available to mount the attack against a substitution permutation network. It will be seen that the time and memory complexities of our attack are modest. The attack is of a non-statistical nature and takes advantage of a weakness inherent in the SPN's structural design. This structural weakness has two aspects: (i) a whitening operation following the last round¹ and (ii) partitioning of n-bit cipher states into sub-blocks of size $n/s$, where $s$ is the number of S-boxes. The first allows the setup of a logical consistency condition that does not involve the subkey bits, and the second enables reduction of the ciphertext-only attack to a problem of solving systems of simultaneous nonlinear boolean equations whose local solutions exist in binary vector spaces of dimensionality much less than the block size.
If, say, each of the S-boxes is a $q \times q$ permutation, with $q = n/s$, then we are able to derive a system of $q$ simultaneous nonlinear boolean equations in $q$ unknowns for each S-box. The primary aim is to obtain intermediate cipher states, beginning with the last round and working backwards, by solving these systems of simultaneous nonlinear equations using logical deduction. The bijectivity and completeness properties of the S-boxes are useful here. Essentially our attack reduces an $R$-round cipher to one-round ciphers and cryptanalyzes these individually and in sequence. Hence Plutarch.

Our contribution.
1. We present a realistic, simple, efficient, and easy-to-reason-about ciphertext-only cryptanalytic attack against the substitution permutation network.
2. We demonstrate that the conventional SPN architecture is faulty, as it facilitates a ciphertext-only attack that cannot be defended against by more S-box design criteria.

Previous work. Chosen plaintext and known plaintext attacks against substitution permutation networks that satisfy the cryptographic property of completeness have appeared in the literature. O'Connor [13] presents a differential cryptanalysis attack against a generic SPN with a completeness property, for which the expected number of chosen plaintexts is proportional to the number of S-boxes. The attack that he outlines targets a variant of the SPN where keyed S-boxes are used, and he notes that the same attack is applicable to the SPN variant where a subkey mask is applied before the S-boxes. In [10] Millan et al. restate the so-called Anderson attack, which is a chosen plaintext attack, together with what they claim is an improvement thereof in terms of speed and efficiency. Since the boolean encodings of S-boxes are seldom known to the attacker and enumerating them is known to be a hard problem (there are at least $2^{2^q}$ [8] boolean functions that map $GF(2^q)$ to $GF(2)$ for a $q \times q$ S-box), this attack might actually be impractical in general.
Heys and Tavares [11] present chosen plaintext and known plaintext attacks against SPNs with a completeness property. The former is based on the possibility, where it exists, of deriving information about S-boxes using a chosen set of plaintext and ciphertext pairs. The attack then exploits such information to construct a network that is functionally equivalent to the original network. The latter seeks to determine the correct target partial subkeys by performing trial encryptions using known plaintext-ciphertext pairs available to the attacker. The design of substitution permutation networks proposed in [1] has no impact on the effectiveness of the ciphertext-only attack that we are proposing.

¹ Seen as a strength!

Organization of the paper. Section 2 presents the preliminaries: notations and symbols that are used in the rest of the paper. Section 3 presents a brief description of a substitution permutation network and defines the problem that is to be solved. Section 4 outlines our ciphertext-only cryptanalysis approach. In section 5 we outline the approach we use to solve the boolean equations. In section 6 we give an outline of our algorithm and provide a verification of its correctness. Section 7 derives the time and memory complexities of our attack and illustrates these complexities for a typical SPN. Section 8 concludes the paper.

2 Preliminaries

This section, in the following table, introduces notations and symbols to be used consistently in the rest of the paper.

$n$ : Block and subkey size
$s$ : Total number of S-boxes per round
$q$ : Number of bits in the input and output of an S-box, i.e., $q = n/s$
$C^1$ : n-bit ciphertext block 1
$C^2$ : n-bit ciphertext block 2
$r$ : Round index
$K_r$ : $r$th subkey
$S_i()$ : $i$th S-box; $S_i : GF(2^q) \to GF(2^q)$
$f_j^{(i)}()$ : $j$th component boolean function of S-box $S_i$; $f_j^{(i)} : GF(2^q) \to GF(2)$, i.e., $S_i(X) = [f_0^{(i)}(X) f_1^{(i)}(X) \ldots f_{q-1}^{(i)}(X)]_2$, indexing from 0
$a \| b$ : $b$ concatenated to $a$
$X_r^1$ : Intermediate data block 1 into round $r$
$X_r^2$ : Intermediate data block 2 into round $r$
$X_r^{u(i)}$ : $u$th partition of $X_r^i$, i.e., $X_r^i = X_r^{1(i)} \| X_r^{2(i)} \| \cdots \| X_r^{s(i)}$, $i = 1, 2$
$R$ : Total number of rounds of a cipher
$\pi$ : A permutation that maps $GF(2^n)$ to itself
$P^1$ : n-bit plaintext block 1
$P^2$ : n-bit plaintext block 2

3 Overview

3.1 A description of the SPN

We will consider a substitution permutation network of the kind shown in figure 1, i.e., one with fixed S-boxes. This network has a block size and a subkey size of $n$ bits, and repeats identical operations in each of its $R$ rounds. The round function consists of subkey mixing through an exclusive-or operation, substitution via $s$ S-boxes, and a permutation of the bit positions.
The network employs $R + 1$ subkeys, $\{K_1, K_2, \ldots, K_{R+1}\}$, assumed independently generated and unrelated, with the first subkey applied as the first operation of the first round and the last subkey applied after the last round using the self-invertible XOR operator. Each of the $s$ S-boxes is bijective (one-to-one and onto) and the bit position permutation is invertible. Therefore, with each of the component operations of a round invertible, the round itself is invertible. Figure 1 illustrates an encryption operation, but decryption is essentially the same, with the key schedule reversed and the mappings used for the S-boxes and bit-position permutation being the inverses of the mappings used in the encryption network. The absence of a permutation after the last round in the encryption network ensures the same structure for the decryption network.
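The structure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block size ($n = 16$, $s = 4$, $q = 4$), the S-box table, the bit permutation, and the subkey values are all hypothetical choices made for the example and are not taken from the paper.

```python
# Toy SPN sketch: n = 16, s = 4, q = 4. Every concrete value is an illustrative assumption.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]    # a bijective 4x4 S-box (hypothetical)
Q, S = 4, 4                                        # bits per S-box, S-boxes per round
N = Q * S                                          # block size
PERM = [(i % Q) * S + i // Q for i in range(N)]    # bit i moves to position PERM[i]
INV_SBOX = [SBOX.index(v) for v in range(2 ** Q)]
INV_PERM = [PERM.index(i) for i in range(N)]


def substitute(block, sbox):
    """Apply the same S-box to each of the s sub-blocks of q bits."""
    out = 0
    for i in range(S):
        sub = (block >> (Q * i)) & (2 ** Q - 1)
        out |= sbox[sub] << (Q * i)
    return out


def permute(block, perm):
    """Move bit i of the block to position perm[i]."""
    out = 0
    for i, target in enumerate(perm):
        out |= ((block >> i) & 1) << target
    return out


def encrypt(plaintext, subkeys):
    """R rounds of key mixing, substitution, permutation; R + 1 subkeys."""
    *round_keys, last_key = subkeys
    rounds = len(round_keys)
    state = plaintext
    for r, key in enumerate(round_keys, start=1):
        state = substitute(state ^ key, SBOX)
        if r < rounds:                  # no permutation in the last round
            state = permute(state, PERM)
    return state ^ last_key             # final subkey applied after the last round


def decrypt(ciphertext, subkeys):
    """Same structure with inverse mappings and the key schedule reversed."""
    *round_keys, last_key = subkeys
    rounds = len(round_keys)
    state = ciphertext ^ last_key
    for r in range(rounds, 0, -1):
        if r < rounds:
            state = permute(state, INV_PERM)
        state = substitute(state, INV_SBOX) ^ round_keys[r - 1]
    return state


if __name__ == "__main__":
    keys = [0x1A2B, 0x3C4D, 0x5E6F, 0x7081]   # R = 3 rounds, 4 subkeys (hypothetical)
    c = encrypt(0xBEEF, keys)
    print(hex(c), hex(decrypt(c, keys)))
```

Decryption recovering the plaintext is a quick check that each round component in the sketch is invertible, as the text argues.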

[Figure 1: the plaintext P (n bits) is XORed with subkey $K_1$, passed through the S-boxes $S_1, S_2, \ldots, S_s$ and then through the permutation; the same operations repeat with subkeys $K_2, \ldots, K_R$, the last round having no permutation; subkey $K_{R+1}$ is then XORed in to give the ciphertext C (n bits). The intermediate values are labeled $X_r$, $Y_r$, $Z_r$.]

Fig. 1. A generic SPN

3.2 Problem definition

Ciphertext-only cryptanalysis is an attack model in which the cryptanalyst possesses only the ciphertext, and aims to recover the key and/or the corresponding plaintext. For our attack the aim is to recover the intermediate data blocks $X_r^1$ and $X_r^2$, for $r \ge 1$, corresponding to the two ciphertext blocks $C^1$ and $C^2$. Use of two ciphertext blocks allows us to exploit a design flaw which is actually thought to contribute to the security of an SPN. This flaw is the subkey exclusive-or operation after the last round. We exploit this flaw by setting up a logical consistency condition that relates the two ciphertext blocks to the corresponding intermediate data blocks $X_r^1$ and $X_r^2$ for $r = R$. The problem is then to solve for the unknowns $X_R^1$ and $X_R^2$. Once these are determined, the process is repeated for the preceding rounds until we reach the first round. It will be shown later that recovering the subkeys is then a simple matter.

In the figure, the quantities labeled $X_r$, $Y_r$, and $Z_r$ are the outputs of the subkey addition operation, the S-boxes, and the bit position permutation respectively, for round $r$. These are the variables that we will use to set up consistency conditions for different rounds, beginning with the last. Note that for an SPN of figure 1 a ciphertext-only exhaustive search has an infeasible best case of $\Theta(2^{2n})$ and a worst case of $\Theta(2^{(R+2)n})$.³

³ $\Theta(f(n))$ is order exactly $f(n)$.

4 Ciphertext-only cryptanalysis

Our attack exploits the following characteristics of an SPN architecture:
1. A subkey is applied following the last round; this enables a setup of the consistency condition using the information available to us, i.e., the two ciphertext blocks.
2. The S-box layer $S$ is a nonlinear operation, i.e., $S(X^1 \oplus X^2) \ne S(X^1) \oplus S(X^2)$; this preserves the data parallelism that our attack exploits.
3. Within a round, S-boxes are applied in parallel; this enables partitioning of data and makes feasible parallel combinatorial searching on spaces of reduced dimensionality.
4. For each S-box, each output bit is a function of all the input bits.

From figure 1 the encryption round function $e()$ has inputs $Z_{r-1}$ and $K_r$, and output $Z_r$, related by $Z_r = e(K_r, Z_{r-1})$, where $e(K_r, Z_{r-1}) = \pi(S(Z_{r-1} \oplus K_r))$, and $S$ denotes the S-boxes; the last round does not include $\pi$. This can also be written as $e(K_r, Z_{r-1}) = \pi(S(X_r))$. Rather than work with the round function, we observe that characteristic 1 above allows us to work with equations of the form $X_r = g(X_{r-1}, K_r)$, where $g(X_{r-1}, K_r) = \pi(S(X_{r-1})) \oplus K_r$; again the last round differs by the absence of $\pi$. The difference between the two is that the former equation contains $K_r$ implicitly and the latter explicitly. It is equations of the latter kind that allow the problem of ciphertext-only cryptanalysis of an SPN to be reduced to that of solving $s$ sets of simultaneous boolean equations, each in $q$ unknowns. To get these systems of equations we derive consistency conditions for the three cases $r = R + 1$, $2 \le r \le R$, and $r = 1$.

In figure 1, $X_r$ is the n-bit cipher state that is divided into $s$ sub-blocks, each of size $q$, in round $r$. $Y_r$ is the n-bit block formed by the concatenated outputs of the $s$ S-boxes, which is, in turn, the input to the permutation. $Z_r$ is the output from the permutation in round $r$.
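We use a bottom-up approach starting with the last round, for which we have known data $C^1$ and $C^2$, and require that $C^1 \oplus C^2 \ne 0$.

4.1 Case $r = R + 1$

Consider two distinct n-bit ciphertext blocks $C^1$ and $C^2$ encrypted using the same key. For the subkey $K_{R+1}$ the following equations hold, where $X^{u(i)}$ denotes the $u$th partition of the block $X_R^i$ entering the S-boxes of the last round:

$$C^1 = Y_R^1 \oplus K_{R+1} = \left[ S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \right] \oplus K_{R+1} \tag{1}$$

and

$$C^2 = Y_R^2 \oplus K_{R+1} = \left[ S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)}) \right] \oplus K_{R+1} \tag{2}$$

Combining (1) and (2) yields

$$C^1 \oplus C^2 = \left[ S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \right] \oplus \left[ S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)}) \right] \tag{3}$$

Let $C = C^1 \oplus C^2$. We partition $C$ into $s$ sub-blocks of $q$ bits each, such that $C = C_1 \| C_2 \| \cdots \| C_s$. Equation (3) can then be written as a set of $s$ equations

$$\begin{aligned} C_1 &= S_1(X^{1(1)}) \oplus S_1(X^{1(2)}) \\ C_2 &= S_2(X^{2(1)}) \oplus S_2(X^{2(2)}) \\ &\;\;\vdots \\ C_s &= S_s(X^{s(1)}) \oplus S_s(X^{s(2)}) \end{aligned} \tag{4}$$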
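The cancellation of the last subkey and the split into per-S-box equations can be checked numerically. The following is a minimal sketch, assuming a toy configuration ($s = 4$ identical 4-bit S-boxes); the S-box table, the cipher states, and the subkey are hypothetical values chosen only for illustration.

```python
# Sketch of the subkey cancellation behind the per-S-box equations; toy parameters assumed.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]    # hypothetical bijective 4x4 S-box
Q, S = 4, 4
MASK = (1 << Q) - 1


def split(block):
    """Split an n-bit block into s sub-blocks of q bits (index 0 = least significant)."""
    return [(block >> (Q * i)) & MASK for i in range(S)]


def sbox_layer(block):
    """Apply the S-box to every sub-block and concatenate the outputs."""
    return sum(SBOX[v] << (Q * i) for i, v in enumerate(split(block)))


def last_round(state, last_subkey):
    """Last round as seen by the attack: S-boxes, then the final subkey XOR."""
    return sbox_layer(state) ^ last_subkey


def subkey_cancels(x1, x2, last_subkey):
    """Check that each sub-block of C1 xor C2 equals S(x1_i) xor S(x2_i), key-free."""
    c1 = last_round(x1, last_subkey)
    c2 = last_round(x2, last_subkey)
    observed = split(c1 ^ c2)
    expected = [SBOX[a] ^ SBOX[b] for a, b in zip(split(x1), split(x2))]
    return observed == expected


if __name__ == "__main__":
    print(subkey_cancels(0x1234, 0xBEEF, 0x5A5A))
```

The check holds for any subkey because the same value is XORed into both ciphertexts, which is the property the consistency condition relies on.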

For each sub-block $C_i$, $i = 1, \ldots, s$, let $(c_0^{(i)} c_1^{(i)} \ldots c_{q-1}^{(i)})_2$ be the binary representation of $C_i$. By considering the encoding of an S-box in terms of its boolean functions, $S_i(X) = [f_0^{(i)}(X) f_1^{(i)}(X) \ldots f_{q-1}^{(i)}(X)]$, each of the equations in (4) can be further divided into $q$ boolean equations thus

$$\begin{aligned} c_0^{(i)} &= f_0^{(i)}(X^{i(1)}) \oplus f_0^{(i)}(X^{i(2)}) \\ c_1^{(i)} &= f_1^{(i)}(X^{i(1)}) \oplus f_1^{(i)}(X^{i(2)}) \\ &\;\;\vdots \\ c_{q-1}^{(i)} &= f_{q-1}^{(i)}(X^{i(1)}) \oplus f_{q-1}^{(i)}(X^{i(2)}) \end{aligned} \tag{5}$$

From (5) we have $q$ equations in $2q$ unknowns at each S-box $S_i$. While the multilinear representations of the boolean functions $f_j^{(i)}$ are unknown, in fact we do not need to know what they are. We will see that the binary representation of each S-box contains all the information that we need about these functions, even if we assume them to be maximally nonlinear, balanced, complete, and so on.

For each S-box $S_i$ we solve the system (5) to obtain a local solution which, when concatenated with the other local solutions, produces the global solutions $X_R^1$ and $X_R^2$. There are $2^s$ ways to form each global solution from its corresponding local solutions, and there are $2^{2s}$ ways in which these global solutions satisfy the consistency condition.

4.2 Case $R \ge r \ge 2$

The solutions $X_R^1$ and $X_R^2$ obtained in the previous section are used in this section as outputs of round $R - 1$. In general, for this case we use the solutions obtained for round $r$ to derive the consistency condition for the unknowns of round $r - 1$. Similarly to the previous case, but noting that for this case the permutation must be taken into account, we derive the following equations, where the partitions $X^{u(i)}$ on the right-hand side now belong to round $r - 1$:

$$X_r^1 = \pi\left( S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \right) \oplus K_r \tag{6}$$

and

$$X_r^2 = \pi\left( S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)}) \right) \oplus K_r \tag{7}$$

Combining (6) and (7) we obtain an equation from which the subkey is eliminated:

$$\pi^{-1}(X_r^1 \oplus X_r^2) = \left[ S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \right] \oplus \left[ S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)}) \right] \tag{8}$$

where $\pi^{-1}$ is the inverse permutation. This is the consistency condition for this case.
We make use of the global solutions obtained previously ($X_R^1$ and $X_R^2$) by setting $r = R$ on the left-hand side of (8), and proceed in a similar manner until we reach $r = 2$. For this case, let $X_r = \pi^{-1}(X_r^1 \oplus X_r^2)$ denote the combined block (written without a block superscript), and partition $X_r$ such that $X_r = X_1^{(r)} \| X_2^{(r)} \| \cdots \| X_s^{(r)}$, where, for $1 \le j \le s$, $X_j^{(r)} \in GF(2^q)$. Using these partitions together with (8) we obtain a set of equations identical in form to (4):

$$\begin{aligned} X_1^{(r)} &= S_1(X^{1(1)}) \oplus S_1(X^{1(2)}) \\ X_2^{(r)} &= S_2(X^{2(1)}) \oplus S_2(X^{2(2)}) \\ &\;\;\vdots \\ X_s^{(r)} &= S_s(X^{s(1)}) \oplus S_s(X^{s(2)}) \end{aligned} \tag{9}$$
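Condition (8) rests on the fact that a bit-position permutation commutes with XOR, so applying its inverse to the XOR of two round outputs removes both the permutation and the subkey. A small sketch of this check follows; the S-box, the permutation, and the test values are hypothetical and chosen only for illustration.

```python
# Sketch of the consistency condition for an inner round; toy parameters assumed.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]    # hypothetical bijective 4x4 S-box
Q, S = 4, 4
N = Q * S
PERM = [(i % Q) * S + i // Q for i in range(N)]    # hypothetical bit permutation
INV_PERM = [PERM.index(i) for i in range(N)]


def permute(block, perm):
    """Move bit i of the block to position perm[i]."""
    out = 0
    for i, target in enumerate(perm):
        out |= ((block >> i) & 1) << target
    return out


def sbox_layer(block):
    """Apply the S-box to each q-bit sub-block."""
    out = 0
    for i in range(S):
        out |= SBOX[(block >> (Q * i)) & (2 ** Q - 1)] << (Q * i)
    return out


def next_state(prev_state, subkey):
    """X_r = pi(S(X_{r-1})) xor K_r, the explicit-key form used in the text."""
    return permute(sbox_layer(prev_state), PERM) ^ subkey


def condition_holds(prev1, prev2, subkey):
    """pi^{-1}(X_r^1 xor X_r^2) should equal S(X_{r-1}^1) xor S(X_{r-1}^2)."""
    lhs = permute(next_state(prev1, subkey) ^ next_state(prev2, subkey), INV_PERM)
    rhs = sbox_layer(prev1) ^ sbox_layer(prev2)
    return lhs == rhs


if __name__ == "__main__":
    print(condition_holds(0x0F0F, 0x1234, 0xC3C3))
```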

Let $(x_0^{(r,i)} x_1^{(r,i)} \ldots x_{q-1}^{(r,i)})_2$ be the binary representation of each partition $X_i^{(r)}$, where $1 \le i \le s$. By expressing the S-boxes in terms of their component boolean functions we get a set of equations identical in form to (5). Thus

$$\begin{aligned} x_0^{(r,i)} &= f_0^{(i)}(X^{i(1)}) \oplus f_0^{(i)}(X^{i(2)}) \\ x_1^{(r,i)} &= f_1^{(i)}(X^{i(1)}) \oplus f_1^{(i)}(X^{i(2)}) \\ &\;\;\vdots \\ x_{q-1}^{(r,i)} &= f_{q-1}^{(i)}(X^{i(1)}) \oplus f_{q-1}^{(i)}(X^{i(2)}) \end{aligned} \tag{10}$$

Here too we have $q$ simultaneous equations in $2q$ unknowns. In the next section we will show how we proceed to solve these equations.

4.3 Case $r = 1$

Unlike the two previous cases, this case does not involve the S-boxes or the permutation. The two equations derived here are

$$P^1 \oplus X_1^1 = K_1 \tag{11}$$

and

$$P^2 \oplus X_1^2 = K_1 \tag{12}$$

For convenience let $P^1$ and $P^2$ be represented as $X_0^1$ and $X_0^2$ respectively. The consistency condition for this case is then

$$X_1^{1(1)} \oplus X_1^{1(2)} \,\|\, X_1^{2(1)} \oplus X_1^{2(2)} \,\|\, \cdots \,\|\, X_1^{s(1)} \oplus X_1^{s(2)} = X_0^{1(1)} \oplus X_0^{1(2)} \,\|\, X_0^{2(1)} \oplus X_0^{2(2)} \,\|\, \cdots \,\|\, X_0^{s(1)} \oplus X_0^{s(2)} \tag{13}$$

The terms on the left-hand side of (13) are solutions of (10) for $r = 2$, and each of the partitioned equations $X_1^{i(1)} \oplus X_1^{i(2)} = X_0^{i(1)} \oplus X_0^{i(2)}$, $1 \le i \le s$, can then be solved independently of all the others. These are equations over $GF(2^q)$, which is a much smaller space than $GF(2^n)$. Consequently an exhaustive search over the space of partitions that satisfy (13) is feasible.

5 Solving systems of equations

Case $R + 1 \ge r \ge 2$. A $q \times q$ S-box is a binary matrix with $2^q$ rows and $q$ columns, where each column is the truth table of one of the component boolean functions with domain $GF(2^q)$. Since the S-boxes are given, we can solve equations (5) without the need to evaluate the boolean functions $f_j^{(i)}$; equations (10) can be solved in a similar way. As noted, for each S-box $S_i$ we are solving systems of simultaneous equations

$$u_j^{(i)} = f_j^{(i)}(X^{i(1)}) \oplus f_j^{(i)}(X^{i(2)}), \quad 0 \le j \le q - 1, \; 1 \le i \le s \tag{14}$$

where $u_j^{(i)} = c_j^{(i)}$ for $r = R + 1$ and $u_j^{(i)} = x_j^{(r,i)}$ for $R \ge r \ge 2$.
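The matrix view of an S-box described above can be made concrete with a short sketch. It is only an illustration: the S-box table is a hypothetical example, the helper names are invented here, and taking $f_0$ as the most significant output bit is an assumption about the bit ordering, which the text does not fix.

```python
# Sketch: an S-box lookup table viewed as q truth tables (columns of a 2^q x q binary matrix).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]    # hypothetical bijective 4x4 S-box
Q = 4


def component_truth_tables(sbox, q):
    """Return q lists; list j is the truth table of f_j over inputs 0 .. 2^q - 1.

    Assumption: f_0 is the most significant output bit, f_{q-1} the least significant.
    """
    return [[(sbox[x] >> (q - 1 - j)) & 1 for x in range(2 ** q)] for j in range(q)]


def inputs_with_output_bit(sbox, q, j, bit):
    """All inputs x for which component function f_j evaluates to the given bit."""
    table = component_truth_tables(sbox, q)[j]
    return {x for x, value in enumerate(table) if value == bit}


if __name__ == "__main__":
    for j, column in enumerate(component_truth_tables(SBOX, Q)):
        print("f_%d:" % j, "".join(str(b) for b in column))
    print(sorted(inputs_with_output_bit(SBOX, Q, 0, 1)))
```

No algebraic form of the component functions is needed here; the lookup table alone determines every column, which is the point made in the text.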
Depending on the value of $u_j^{(i)}$, $u_j^{(i)} \in GF(2)$, we can use logical deduction to arrive at the possible values for each of the operands in the exclusive-or sum. This leads to two scenarios for each value of $u_j^{(i)}$ as follows:

$$u_j^{(i)} = \begin{cases} 0 & \text{implies } f_j^{(i)}(X^{i(1)}) = 0 \text{ and } f_j^{(i)}(X^{i(2)}) = 0 \;\text{ OR }\; f_j^{(i)}(X^{i(1)}) = 1 \text{ and } f_j^{(i)}(X^{i(2)}) = 1 \\ 1 & \text{implies } f_j^{(i)}(X^{i(1)}) = 1 \text{ and } f_j^{(i)}(X^{i(2)}) = 0 \;\text{ OR }\; f_j^{(i)}(X^{i(1)}) = 0 \text{ and } f_j^{(i)}(X^{i(2)}) = 1 \end{cases} \tag{15}$$

Equation (15) gives rise to four valid possibilities in total, two for each value of $u_j^{(i)}$, and no other possibilities exist. The unknowns $X^{i(1)}$ and $X^{i(2)}$ are in the domain $GF(2^q)$ of the S-boxes $S_i$. Therefore solving the equations in (15) amounts to defining a truth table of each of the $q$ component boolean functions $f_j^{(i)}$, and listing all the binary vectors that verify the equation. Representing the truth tables of all $f_j^{(i)}$ for all the S-boxes $S_i$ requires, at a minimum, $O(2^q (q + s))$ bits. Let

$$\begin{aligned} \xi_{i,j}^{(1)} &= \{ X^{i(1)} \mid f_j^{(i)}(X^{i(1)}) = \alpha, \; \alpha \in GF(2), \; 0 \le j \le q - 1, \; 1 \le i \le s \} \\ \xi_{i,j}^{(2)} &= \{ X^{i(2)} \mid f_j^{(i)}(X^{i(2)}) = \beta, \; \beta \in GF(2), \; 0 \le j \le q - 1, \; 1 \le i \le s \} \end{aligned} \tag{16}$$

where $\xi_{i,j}^{(1)}$ and $\xi_{i,j}^{(2)}$ are sets of solutions for each $f_j^{(i)}$. For each of the partitions $X^{i(1)}$ and $X^{i(2)}$, we want the common element in each of the sets (16). That is, we want elements of the sets $\Xi_i^{(1)}$ and $\Xi_i^{(2)}$, where

$$\Xi_i^{(1)} = \bigcap_{j=0}^{q-1} \xi_{i,j}^{(1)}, \qquad \Xi_i^{(2)} = \bigcap_{j=0}^{q-1} \xi_{i,j}^{(2)} \tag{17}$$

for $1 \le i \le s$. Because of the bijectivity property of S-boxes, each of the sets $\Xi_i^{(1)}$ and $\Xi_i^{(2)}$ will have a distinct q-bit element. Each of these elements is a solution to (15) for $1 \le i \le s$.

Observe that this attack requires that the S-boxes satisfy the cryptographic property of completeness, i.e., each output bit from the S-box must be a function of all the input bits to the S-box. Where some output bits are functions of some but not all input bits, the intersection sets (17) will not be obtained. Ironically then, an SPN whose S-boxes do not satisfy the completeness property will be secure against the proposed attack.

Case $r = 1$. For this case the approach is identical to that of (15), but applied to the linear equations (13). Using the known bit values of $X_1^{i(1)}$ and $X_1^{i(2)}$, for $1 \le i \le s$, we deduce all the linear combinations of $X_0^{i(1)}$ and $X_0^{i(2)}$ implied by (13). Here as well, because of the exclusive-or operation, four possibilities will be implied by the values of each bit on the left-hand side of (13).

5.1 Recovering the subkeys

For $r \ge 2$, equation (17) gives all the candidate partitions that are consistent with (14). As (15) showed, there are two possibilities for each partition that are consistent with (14). Therefore there are $2^s$ ways to form each of the blocks $X_r^1$ or $X_r^2$ from their partitions. That is, there are $2^{2s}$ ways to evaluate (3) when $r = R + 1$, (8) when $R \ge r \ge 2$, or (13) when $r = 1$.
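The counting argument above, two candidates per partition giving $2^s$ candidate blocks, can be sketched as a Cartesian product over the local candidate lists. This is only an illustration of the concatenation step: the candidate values below are made up, and the premise of exactly two candidates per partition is taken from the text rather than derived by the code.

```python
# Sketch: forming global n-bit candidates from per-partition (local) candidates.
from itertools import product

Q = 4   # bits per partition (toy value, assumed)


def concatenate(partitions, q=Q):
    """Join q-bit partitions into one block (partition 0 = least significant bits)."""
    block = 0
    for i, part in enumerate(partitions):
        block |= part << (q * i)
    return block


def global_candidates(local_candidates, q=Q):
    """Every block obtainable by picking one candidate for each partition.

    local_candidates[i] holds the candidate q-bit values for partition i.
    """
    return [concatenate(choice, q) for choice in product(*local_candidates)]


if __name__ == "__main__":
    # Hypothetical local solutions: two candidates for each of s = 4 partitions.
    local = [[0x1, 0x9], [0x2, 0xA], [0x3, 0xB], [0x4, 0xC]]
    blocks = global_candidates(local)
    print(len(blocks), [hex(b) for b in blocks[:4]])
```

With two lists of this kind, one per ciphertext block, the number of pairs to compare is the product of the two list lengths, which is where the $2^{2s}$ figure in the text comes from.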
Subkey recovery begins with $K_{R+1}$. To recover this subkey we create two lists, each of length $2^s$, and each consisting of values of the quantities shown on the top row of the table below, evaluated at each of the $2^s$ possible concatenations of partitions from (17) for each block. Each of the bit strings $(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_v$ and $(\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_v$ is a distinct binary value of the expression in the top row for each of the $2^s$ values of $X_r^1$ and $X_r^2$.

$C^1 \oplus S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)})$  |  $C^2 \oplus S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)})$
$(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_0$  |  $(\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_0$
$(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_1$  |  $(\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_1$
$(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_2$  |  $(\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_2$
...  |  ...
$(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_{2^s - 1}$  |  $(\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_{2^s - 1}$

From the two lists we search for two bit strings for which $(\alpha_0 \alpha_1 \alpha_2 \ldots \alpha_{n-1})_w = (\beta_0 \beta_1 \beta_2 \ldots \beta_{n-1})_v$ for some $0 \le v, w \le 2^s - 1$. This value is the subkey $K_{R+1}$ around which the consistency condition (3) was derived. The corresponding values of $X_R^1$ and $X_R^2$ are the valid cipher states which, when encrypted with $K_{R+1}$, result in $C^1$ and $C^2$. The argument here is not circular since the consistency condition involved two

distinct ciphertext blocks from which $2 \cdot 2^s$ possible candidates of the $R$th cipher state were deduced, which were then used independently to compute distinct values in each column. With the cipher states $X_R^1$ and $X_R^2$ known, we proceed similarly to compute $X_r^1$ and $X_r^2$ for $r \ge 2$ and recover each of the subkeys $K_r$.

For $r = 1$ we simply use the partitions implied by (13) and form $2^s$ states that are evaluations of the left-hand side of (11), and $2^s$ states that are evaluations of the left-hand side of (12). Arranging these states into two lists as previously, we search for entries in the first list that are identical to entries in the second list. Where identical entries are found, these are candidates for the subkey $K_1$, and the corresponding states $X_0^1$ and $X_0^2$ are the candidates for the input blocks or plaintexts.

6 Algorithm: Outline and Verification

This section brings together all the foregoing ideas into a single algorithmic description. We then give an illustration that the SPN architecture facilitates the attack we propose. Lastly we give an analysis that shows the operations count and memory requirements for the algorithm.

Algorithm SMASH (Ciphertext-only cryptanalysis). This algorithm computes a key schedule $\{K_{R+1}, K_R, \ldots, K_1\}$ and two plaintext blocks $P^1$ and $P^2$ for a substitution permutation network given only two ciphertext blocks $C^1$ and $C^2$. Additional variables $U$ and $V$ are used as placeholders.
Set $U \leftarrow C^1$, $V \leftarrow C^2$, and $r \leftarrow R + 1$
while $r \ge 1$ do
    if $r = R + 1$ then
        Form the consistency condition
            $U \oplus V = S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \oplus S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)})$
    else
        Form the consistency condition
            $\pi^{-1}(U \oplus V) = S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}) \oplus S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)})$
    end if
    for $1 \le i \le s$ do
        for $1 \le j \le q$ do
            Solve a system of simultaneous equations $u_j^i \oplus v_j^i = f_j^i(X^{i(1)}) \oplus f_j^i(X^{i(2)})$
        end for
    end for
    Form two lists of $2^s$ blocks $X_r^1$ and $X_r^2$ by concatenating their partitions $X_r^{i(1)}$ and $X_r^{i(2)}$
    For list 1 evaluate $(U \oplus S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}))_t$, $1 \le t \le 2^s$
    For list 2 evaluate $(V \oplus S_1(X^{1(2)}) \| S_2(X^{2(2)}) \| \cdots \| S_s(X^{s(2)}))_w$, $1 \le w \le 2^s$
    for $1 \le t, w \le 2^s$ do
        if $(U \oplus S_1(X^{1(1)}) \| \cdots \| S_s(X^{s(1)}))_t = (V \oplus S_1(X^{1(2)}) \| \cdots \| S_s(X^{s(2)}))_w$ then
            We have the subkey
            return $(U \oplus S_1(X^{1(1)}) \| S_2(X^{2(1)}) \| \cdots \| S_s(X^{s(1)}))_t$
        end if
    end for
    Set $r \leftarrow r - 1$, $U \leftarrow X_r^1$, and $V \leftarrow X_r^2$
    if $r = 1$ then
        Form the consistency condition
            $X_1^{1(1)} \oplus X_1^{1(2)} \| X_1^{2(1)} \oplus X_1^{2(2)} \| \cdots \| X_1^{s(1)} \oplus X_1^{s(2)} = X_0^{1(1)} \oplus X_0^{1(2)} \| X_0^{2(1)} \oplus X_0^{2(2)} \| \cdots \| X_0^{s(1)} \oplus X_0^{s(2)}$
        for $1 \le i \le s$ do
            Solve a system of simultaneous equations $X_1^{i(1)} \oplus X_1^{i(2)} = X_0^{i(1)} \oplus X_0^{i(2)}$
        end for
        Form two lists of $2^s$ blocks $X_0^1$ and $X_0^2$ by concatenating their partitions $X_0^{i(1)}$ and $X_0^{i(2)}$
        For list 1 evaluate $(X_0^1 \oplus X_1^1)_t$, $1 \le t \le 2^s$
        For list 2 evaluate $(X_0^2 \oplus X_1^2)_w$, $1 \le w \le 2^s$
        for $1 \le t, w \le 2^s$ do
            if $(X_0^1 \oplus X_1^1)_t = (X_0^2 \oplus X_1^2)_w$ then
                We have the subkey(s)
                return $(X_0^1 \oplus X_1^1)_t$
            end if
        end for
    end if
end while

6.1 Verification

In this subsection we verify that our algorithm gives the correct key schedule and corresponding plaintext blocks. Let $E_{\{K_{R+1}, K_R, \ldots, K_1\}}()$ denote the SPN of figure 1, $P^1$ and $P^2$ the n-bit plaintext blocks, $\{K_{R+1}, K_R, \ldots, K_1\}$ the key schedule, $S()$ all the $s$ S-boxes, and $\pi$ the bit position permutation. Our verification amounts to proving the following lemma.

Lemma 1. Suppose $C^1 = E_{\{K_{R+1}, K_R, \ldots, K_1\}}(P^1)$ and $C^2 = E_{\{K_{R+1}, K_R, \ldots, K_1\}}(P^2)$, and let $S() = S_1() \| S_2() \| \cdots \| S_s()$. Given only $C^1$ and $C^2$, algorithm SMASH recovers $P^1$ and $P^2$ with probability $2^{-Rs}$ and the entire key schedule with probability $2^{-2(R+1)s}$.

Proof. For the SPN of figure 1, encryption can be expressed as

$$C^1 = E_{\{K_{R+1}, K_R, \ldots, K_1\}}(P^1) = K_{R+1} \oplus S(K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^1)))\cdots))))))$$

$$C^2 = E_{\{K_{R+1}, K_R, \ldots, K_1\}}(P^2) = K_{R+1} \oplus S(K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^2)))\cdots))))))$$

For $r = R + 1$ the subkey $K_{R+1}$ is obtained when

$$C^1 \oplus C^2 = S(K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^1)))\cdots)))))) \oplus S(K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^2)))\cdots)))))) \tag{18}$$

Since each bit on the left-hand side of (18) implies two possibilities on the right-hand side, there are $2^s$ ways to construct $K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^1)))\cdots)))))$ and $2^s$ ways to construct $K_R \oplus \pi(S(K_{R-1} \oplus \pi(S(K_{R-2} \oplus \pi(\cdots(K_2 \oplus \pi(S(K_1 \oplus P^2)))\cdots)))))$, and the same holds for constructing all the other $X_r^1$ and $X_r^2$ when $r \ge 1$. Therefore $K_{R+1}$ is obtained with probability $2^{-2s}$, and $P^1$ and $P^2$ are obtained with probability

$$\underbrace{2^{-s} \cdot 2^{-s} \cdots 2^{-s}}_{R} = 2^{-Rs}$$

The correct solution to (18) yields $X_R^1$ and $X_R^2$, which we use to obtain $K_R$ with probability $2^{-2s}$. We proceed in the same way for $R - 1 \ge r \ge 1$, obtaining each of the $K_r$ with probability $2^{-2s}$. Since there are $R + 1$ subkeys, the probability of recovering the entire correct key schedule is

$$\underbrace{2^{-2s} \cdot 2^{-2s} \cdots 2^{-2s}}_{R+1} = 2^{-2(R+1)s}$$

For a given $R$ and $s$ this does not appear to be better than exhaustive search, given the relative sizes of $R$ and $q$ that SPNs tend to have. However, Lemma 2 in section 7 says we do better.

7 Complexity of the cryptanalysis

Time. For our algorithm, time goes into solving systems of nonlinear boolean equations by logical deduction, constructing lists of cipher states without duplications, evaluating subkey expressions at different cipher states, and searching for subkeys in the resulting lists. Assuming a serial computer, in each round, for each $1 \le i \le s$ and each $1 \le j \le q$ we perform $2^q$ comparisons, resulting in $2^q \cdot sq \cdot 2$, which is $2^{q+1} n$ comparisons. We then perform $2^s \cdot 2$ concatenations constructing two lists, and spend a further $2^s \cdot 2$ operations on evaluation at all the cipher states. To find a subkey we perform $2^{2s}$ comparisons. In all, this is performed $R + 1$ times, resulting in a time complexity of

$$(R + 1)(2^{q+1} n + 2^{s+2} + 2^{2s}) \tag{19}$$

or an average of

$$(R + 1)(2^q n + 2^{s+1} + 2^{2s-1}) \tag{20}$$

operations. An improvement on this can be obtained on a parallel machine with $s$ processors. Each processor can then be assigned a system of $q$ equations centered at S-box $S_i$, and solve these serially. In this case, solving the equations takes $2^{q+1} q$. Constructing the lists takes $2^{s+1}/s$ concatenations. Evaluations take a further $2^{s+1}/s$. Finding a subkey now takes $2^{2s}/s$. In total we have the running time

$$(R + 1)\left(2^{q+1} q + \frac{2^{s+2} + 2^{2s}}{s}\right) \tag{21}$$

On average this is

$$(R + 1)\left(2^q q + \frac{2^{s+1} + 2^{2s-1}}{s}\right) \tag{22}$$

Even when $n = 128$, with $s$ and $q$ suitably chosen, these quantities ((19) to (22)) are not embarrassingly huge. This says the ciphertext-only attack against an SPN of figure 1 is quite efficient.

Memory. Truth tables for all the systems of equations take up $2^{q+1} n$ bits. In addition, lists of concatenated blocks take up $2^{s+1} n$ bits. Lists from which subkeys are deduced take a further $2^{s+1} n$ bits. Lastly, there are $2n$ bits of input data for each iteration. The total memory required is

$$2^{q+1} n + 2^{s+2} n + 2n \tag{23}$$

bits, which is polynomial in $n$.
This also is modest even when $n = 128$, with $q$ and $s$ suitably chosen and taking into account constraints on the design of practical S-boxes.

Lemma 2. There are only $2^{s+1}$ n-bit binary vectors admissible to the consistency condition.

Proof. Each of the n-bit vectors $X_r^1$ and $X_r^2$, $r \ge 1$, can be formed in $2^s$ ways from partitions implied by the consistency condition, leading to $2^s \cdot 2 = 2^{s+1}$ binary vectors only.

Lemma 2 says that we need not consider all the other $2^n - 2^{s+1}$ vectors that are not implied by the consistency condition, which means that of a huge

$$\binom{2^n}{2^{s+1}} = \frac{2^n!}{(2^n - 2^{s+1})! \; 2^{s+1}!} \tag{24}$$

possibilities, the information we have in the form of two ciphertext blocks is sufficient to eliminate a large number, and the SPN design is such that this approach guides us to a single set of such binary vectors that is most likely. This is a huge saving in storage and time over an exhaustive search that would require an examination of all the possibilities as implied by Lemma 1.
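The serial time expressions (19) and (20) and the memory expression (23) are straightforward to evaluate for concrete parameters. The helper names below are hypothetical and the parameter values are only an illustrative choice; the functions transcribe the formulas as stated and do not validate them.

```python
# Sketch: evaluating the stated complexity expressions for given SPN parameters.

def time_serial(n, s, q, rounds):
    """Expression (19): (R + 1)(2^(q+1) n + 2^(s+2) + 2^(2s))."""
    return (rounds + 1) * (2 ** (q + 1) * n + 2 ** (s + 2) + 2 ** (2 * s))


def time_serial_average(n, s, q, rounds):
    """Expression (20): (R + 1)(2^q n + 2^(s+1) + 2^(2s-1))."""
    return (rounds + 1) * (2 ** q * n + 2 ** (s + 1) + 2 ** (2 * s - 1))


def memory_bits(n, s, q):
    """Expression (23): 2^(q+1) n + 2^(s+2) n + 2n bits."""
    return 2 ** (q + 1) * n + 2 ** (s + 2) * n + 2 * n


if __name__ == "__main__":
    n, s, q, rounds = 64, 8, 8, 8        # illustrative parameters with n = s * q
    assert n == s * q
    print("worst-case operations:", time_serial(n, s, q, rounds))
    print("average operations:   ", time_serial_average(n, s, q, rounds))
    bits = memory_bits(n, s, q)
    print("memory:", bits, "bits =", bits // 8, "bytes")
```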

A generic SPN. Consider an SPN cipher (of Figure 1) for which $n = 64$, $s = 8$, $q = 8$, and $R = 8$. Assuming the s-boxes are distinct, the truth tables of all the s-boxes take up $8 \cdot 2^8 \cdot 2 \cdot 8 = 2^{15}$ bits, or 4 kilobytes. In addition, the lists of cipher states formed from partitions consume $2^{8+1} \cdot 64$ bits of storage, and a further $2^{8+1} \cdot 64$ bits are taken up by the lists from which subkeys are determined. Lastly, there are $2 \cdot 64 = 2^7$ bits of input data to each round. The total storage required is therefore $3 \cdot 2^{15} + 2^7$ bits, or about 12.02 kilobytes, which remains constant for all $R + 1 \ge r \ge 1$. This is insignificant compared to the storage required by attacks based on popular cryptanalysis models. By (20), the total number of operations the algorithm performs is exponential in $q$ and $s$ but not in the block size $n$. For the given parameters, this number is $9(2^{15} + 2^{14} + 2^9)$, which is also insignificant compared to the time complexities of the more popular approaches. A characteristic feature of our attack is that it depends only on the number of s-boxes and the size of their inputs, so that both its memory and time complexities grow with the number of s-boxes or with their size.

8 Conclusion

We presented a new and simple ciphertext-only attack against a conventional substitution permutation network. The attack is realistic in that it requires the minimum possible amount of data, namely two ciphertext blocks. The strength of the attack is the weakness exhibited by the design of the conventional SPN structure. The attack is based purely on logical deduction and is thus immune to defenses that might be effective against cryptanalytic attacks with a statistical bent. Necessary and sufficient prerequisites for this attack are bijectivity and completeness of s-boxes⁴. Because the binary representation of an s-box is all that the attack requires, its complexity remains the same even as the number of s-box design criteria tends to infinity.
We also showed that our attack is very efficient in terms of its time complexity and memory requirements.

⁴ Akin to Judo, it exploits the strength of the opponent to its advantage.