Thesis Research Notes
Week 26-2012
Christopher Wood
June 29, 2012

Abstract

This week was devoted to reviewing some classical literature on Boolean functions and their application to cryptography.

1 Perfect nonlinear S-boxes [1]

This paper deals with the fundamental mathematical properties of bent functions and their role in constructing perfect nonlinear transformations (S-boxes). It comprises three main sections:

1. Bent functions, properties, and constructions
2. Perfect nonlinear transformations
3. Efficient construction mechanisms (targeting actual implementations)

1.1 Informal Discussion of Bent Functions

Before getting into formal definitions of bent functions, we first discuss some basic ideas around their definition and properties. A bent function is a special type of Boolean function, so named because it is as different as possible from all linear and affine functions. This is an important property for nonlinear cryptographic functions, since functions that behave almost linearly are susceptible to linear and differential cryptanalysis.

1.2 Some Properties of Bent Functions

Most definitions of bent functions (at least in the context of Nyberg's discussion) rely on the Fourier transform. Several other tools, such as the shifted cross-correlation, are used as well. For completeness, we define them as follows.

Definition 1. Let $u = e^{2\pi i/q}$ be a primitive $q$th root of unity in $\mathbb{C}$ and let $f : \mathbb{Z}_q^n \to \mathbb{Z}_q$. The Fourier transform of $u^f$ is defined as follows:

\[ F(w) = \frac{1}{\sqrt{q^n}} \sum_{x \in \mathbb{Z}_q^n} u^{f(x) - w \cdot x}, \qquad w \in \mathbb{Z}_q^n. \]

With this normalization, Parseval's identity gives $\sum_{w \in \mathbb{Z}_q^n} |F(w)|^2 = q^n$.
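For small parameters, Definition 1 can be evaluated directly by brute force. The following is a minimal sketch, not taken from Nyberg's paper: the function name `fourier_transform` and the two example functions are assumptions chosen for illustration, and the normalization $1/\sqrt{q^n}$ is assumed. For $q = 2$, $n = 2$, the product $x_1 x_2$ (a standard small example of a bent function) has a transform of magnitude 1 at every point, whereas the linear function $x_1$ concentrates all of its weight on a single point.

```python
import cmath
from itertools import product


def fourier_transform(f, q, n):
    """Brute-force Fourier transform of u^f for f: Z_q^n -> Z_q.

    Hypothetical helper for illustration; assumes the 1/sqrt(q^n)
    normalization and u = exp(2*pi*i/q).
    """
    u = cmath.exp(2j * cmath.pi / q)
    points = list(product(range(q), repeat=n))
    scale = 1 / (q ** n) ** 0.5
    transform = {}
    for w in points:
        total = 0
        for x in points:
            dot = sum(wi * xi for wi, xi in zip(w, x))
            # The exponent is reduced mod q because u has order q.
            total += u ** ((f(x) - dot) % q)
        transform[w] = scale * total
    return transform


# Illustrative inputs (assumptions, not from the paper).
quadratic = fourier_transform(lambda x: (x[0] * x[1]) % 2, q=2, n=2)
linear = fourier_transform(lambda x: x[0] % 2, q=2, n=2)

for w in sorted(quadratic):
    print(w, round(abs(quadratic[w]), 6), round(abs(linear[w]), 6))
```

In both cases the squared magnitudes sum to $q^n = 4$, as Parseval's identity requires; only the way that total is spread over the points $w$ differs.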
Definition 1 is stated over $\mathbb{C}$ partly for notational simplicity (i.e. we make use of Euler's identity $e^{ix} = \cos(x) + i \sin(x)$); we could just as easily have defined it over $\mathbb{R}$ using sines and cosines.

Definition 2. The shifted cross-correlation of two functions $f$ and $g$ from $\mathbb{Z}_q^n$ to $\mathbb{Z}_q$ is defined as follows:

\[ c(f, g)(w) = \frac{1}{q^n} \sum_{x \in \mathbb{Z}_q^n} u^{f(x+w) - g(x)}. \]

We now use the Fourier transform to define bent functions.

Definition 3. A function $f : \mathbb{Z}_q^n \to \mathbb{Z}_q$ is bent if $|F(w)| = 1$ for all $w \in \mathbb{Z}_q^n$.

This definition was first given in [?]. It contains a lot of subtle information and needs to be explored in more detail.

2 Fourier Transform and its Applications in Cryptography

In [2], Massey discusses the practical applications of the Discrete Fourier Transform (DFT) in both coding theory and cryptography. In doing so, he first describes the usual DFT in terms of fields as follows.

Definition 4. Let $u$ be a primitive $N$th root of unity in the field $F$ (i.e. $u^N = 1$ but $u^i \neq 1$ for $1 \leq i < N$). The Discrete Fourier Transform of length $N$ generated by $u$ is the mapping $DFT_u(\cdot)$ from $F^N$ to $F^N$ defined by $B = DFT_u(b)$ as follows:

\[ B[i] = \sum_{n=0}^{N-1} b[n] \, u^{in}. \]

In this context, $b = (b[0], b[1], \ldots, b[N-1])$ is the time domain sequence and $B = (B[0], B[1], \ldots, B[N-1])$ is the frequency domain sequence. The inverse of this transform is as follows:

\[ b[n] = \frac{1}{N} \sum_{i=0}^{N-1} B[i] \, u^{-in}, \]

where $1/N$ denotes the multiplicative inverse of $N$ in $F$.

This transform is commonly used to measure the linear complexity of sequences, where the linear complexity is defined as the length $L$ of the shortest linear feedback shift register (LFSR) that produces the entire sequence (of length $n$) when loaded with the first $L$ terms of the sequence. In particular, the DFT is used to analyze sequences produced by nonlinear combinations of sequences.
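Definition 4 and its inverse can be sketched over a small prime field. This is a minimal illustration rather than Massey's construction: the field $GF(17)$, the length $N = 4$, and the helper names `find_primitive_root_of_unity`, `dft`, and `inverse_dft` are all assumptions made for the example, and the code assumes the modulus $p$ is prime with $N$ dividing $p - 1$.

```python
def find_primitive_root_of_unity(N, p):
    """Return the smallest element of GF(p) with multiplicative order exactly N.

    Hypothetical helper; assumes p is prime and N divides p - 1.
    """
    if (p - 1) % N != 0:
        raise ValueError("N must divide p - 1 for an Nth root of unity to exist")
    for u in range(1, p):
        if pow(u, N, p) == 1 and all(pow(u, i, p) != 1 for i in range(1, N)):
            return u
    raise ValueError("no primitive Nth root of unity found")


def dft(b, u, p):
    """B[i] = sum_n b[n] * u^(i*n), computed in GF(p)."""
    N = len(b)
    return [sum(b[n] * pow(u, i * n, p) for n in range(N)) % p for i in range(N)]


def inverse_dft(B, u, p):
    """b[n] = N^(-1) * sum_i B[i] * u^(-i*n), computed in GF(p)."""
    N = len(B)
    n_inv = pow(N, p - 2, p)  # inverse of N via Fermat's little theorem (p prime)
    u_inv = pow(u, p - 2, p)  # inverse of u, so u_inv^(i*n) = u^(-i*n)
    return [n_inv * sum(B[i] * pow(u_inv, i * n, p) for i in range(N)) % p
            for n in range(N)]


# Illustrative parameters (assumptions): GF(17) and a length-4 sequence.
p, N = 17, 4
u = find_primitive_root_of_unity(N, p)
b = [3, 1, 4, 1]
B = dft(b, u, p)
print("u =", u)
print("B =", B)
print("round trip ok:", inverse_dft(B, u, p) == b)
```

The round trip through `dft` and `inverse_dft` recovers the time domain sequence, which is a quick sanity check on the sign of the exponent and the $1/N$ factor in the inverse.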
3 Linear Cryptanalysis and S-box Analysis

This is a slight change of pace from the other papers read this week, but it aligns with the overall goal of the thesis. In particular, I hope to use this tool in the analysis of S-boxes. In this brief note I review the concepts discussed by Stinson in [3].

Generally speaking, the idea of linear cryptanalysis is to find a linear relationship between a subset of the plaintext bits and a subset of the state bits before the last XOR operation in the cipher that is satisfied with some probability $p$ bounded away from 1/2. Using a large set of plaintext/ciphertext pairs, the attacker partially decrypts each ciphertext block under every candidate key (the candidates range over the last-round key bits involved in that XOR operation). While doing so, a counter associated with each candidate key is incremented every time the bit subset relationship is satisfied. When all given ciphertexts have been processed, the attacker chooses the candidate key whose count is furthest from one half of the number of plaintext/ciphertext pairs.

Before getting into more specific details, we first review some fundamental rules from probability theory that are used to justify (and prove) the correctness of these attacks.

3.1 Piling Up Lemma

Lemma 1 (The Piling Up Lemma). Let $\epsilon_{i_1, i_2, \ldots, i_k}$ denote the bias of the random variable $X_{i_1} \oplus \cdots \oplus X_{i_k}$. Then

\[ \epsilon_{i_1, i_2, \ldots, i_k} = 2^{k-1} \prod_{j=1}^{k} \epsilon_{i_j}. \]

In the context of this lemma, each random variable $X_i$ takes values in the set $\{0, 1\}$ with $\Pr[X_i = 0] = p_i$, its bias is $\epsilon_i = p_i - 1/2$, and we assume that the random variables $X_i$ are independent. With these conditions, we can easily write down the probability distribution of each random variable, and subsequently the distribution of $X_i \oplus X_j$. The proof of Lemma 1 is a straightforward induction on $k$, where $k = 1$ and $k = 2$ serve as the base cases (the single variable and the XOR of two random variables).
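The Piling Up Lemma can be checked on small cases by enumerating every outcome of the independent variables and comparing the resulting bias with the closed form. The sketch below is illustrative only: the function names `xor_bias_exact` and `xor_bias_piling_up` and the sample probabilities are assumptions, and exact rational arithmetic is used so the two values can be compared for equality.

```python
from fractions import Fraction
from itertools import product


def xor_bias_exact(p_zero):
    """Bias of X_1 xor ... xor X_k by enumeration, where Pr[X_i = 0] = p_zero[i].

    Hypothetical helper; relies on the X_i being independent, so the
    probability of a joint outcome is the product of the marginals.
    """
    prob_xor_is_zero = Fraction(0)
    for bits in product((0, 1), repeat=len(p_zero)):
        prob = Fraction(1)
        for bit, p in zip(bits, p_zero):
            prob *= p if bit == 0 else 1 - p
        if sum(bits) % 2 == 0:  # XOR of the bits is 0
            prob_xor_is_zero += prob
    return prob_xor_is_zero - Fraction(1, 2)


def xor_bias_piling_up(p_zero):
    """Bias predicted by the lemma: 2^(k-1) times the product of the biases."""
    result = Fraction(2) ** (len(p_zero) - 1)
    for p in p_zero:
        result *= p - Fraction(1, 2)
    return result


# Sample probabilities (assumptions chosen for illustration).
p_zero = [Fraction(3, 4), Fraction(2, 3), Fraction(1, 4)]
print(xor_bias_exact(p_zero), xor_bias_piling_up(p_zero))
```

For these sample values the individual biases are 1/4, 1/6, and -1/4, so both computations give $2^2 \cdot (1/4)(1/6)(-1/4) = -1/24$.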
From this lemma we also obtain the following corollary.

Corollary 0.1. Let $\epsilon_{i_1, i_2, \ldots, i_k}$ denote the bias of the random variable $X_{i_1} \oplus \cdots \oplus X_{i_k}$. Suppose that $\epsilon_{i_j} = 0$ for some $j$. Then $\epsilon_{i_1, i_2, \ldots, i_k} = 0$.

3.2 Linear Approximations of S-boxes

Let $\pi_S : \{0, 1\}^m \to \{0, 1\}^n$ be the S-box in question. Each element of the input vector, $x_i$, corresponds to some random variable $X_i$, and these variables are independent. In addition, each element
of the output vector, $y_i$, corresponds to some random variable $Y_i$. However, these variables are not, in general, independent of each other or of the $X_i$ variables. One way to approximate the S-box is to compute the bias of the random variable

\[ X_{i_1} \oplus \cdots \oplus X_{i_k} \oplus Y_{j_1} \oplus \cdots \oplus Y_{j_l}. \]

It seems that such a computation cannot rely on the Piling Up Lemma discussed in the previous section, because the random variables are not independent. It would be an interesting exercise to solve for the bias analytically.

4 Unknown Terms

Cross-correlation

Definition 5. For two continuous functions $f$ and $g$, the cross-correlation is defined as

\[ (f \star g)(t) = \int_{-\infty}^{\infty} f^*(\tau) \, g(t + \tau) \, d\tau, \]

where $f^*$ is the complex conjugate of $f$.

This quantity is a measure of the similarity of two waveforms as a function of a time lag applied to one of them. In cryptography, it can be used as a pattern recognition tool, which turns out to be particularly helpful when performing cryptanalysis. The idea is similar in nature to the convolution of $f$ and $g$.

5 Ideas

Would it be feasible to replace the S-box in Rijndael with a nonlinear recurrence relation that has similar security properties? Conversely, what can we gain by representing the current S-box as a recurrence relation? What is the linear complexity of the Rijndael S-box?

LFSRs are commonly used to construct stream ciphers and pseudo-random number generators (PRNGs). In particular, combinations of multiple LFSRs that are evaluated in parallel are often used to create nonlinear output sequences. Is this a viable technique for block ciphers?

Another technique is referred to as nonlinear updating, in which the memory of the generating function (i.e.
the transition function between two states) is partitioned into two bit sets of sizes $l_1$ and $l_2$ (where $l_1 + l_2 = l$, the length of the memory), and two update functions $f_1 : \{0, 1\}^{l_1} \to \{0, 1\}^{l_1}$ and $f_2 : \{0, 1\}^{l} \to \{0, 1\}^{l_2}$ are applied (where $f_1$ is linear and $f_2$ is nonlinear and uses some of the nonlinear bits in $l_2$) [4]. It would be interesting to compare
the security of such a construction in Rijndael, where the SubBytes routine is now composed of a nonlinear S-box and a linear LFSR-equivalent function.

References

[1] K. Nyberg, "Perfect nonlinear S-boxes," in EUROCRYPT, 1991, pp. 378-386.
[2] J. L. Massey, "The discrete Fourier transform in coding and cryptography," in IEEE Inform. Theory Workshop, ITW 98, 1998, pp. 9-11.
[3] D. R. Stinson, Cryptography - Theory and Practice, ser. Discrete Mathematics and Its Applications. CRC Press, 2006.
[4] E. Zenner, "Cryptanalysis of LFSR-based pseudorandom generators - a survey," Tech. Rep., 2004.