Difference attacks on toy block ciphers

Wildon's Weblog 2019-08-20

After setting an examination for my Cipher Systems course that turned me from hero to zero in the eyes of the students (you can guess why), I’m somewhat disillusioned with the subject. One unavoidable issue is that it’s hard to isolate the interesting mathematical ideas from the forest of notation and definitions needed to define a modern block cipher such as AES. With a few honorable exceptions, it seems most textbooks don’t even try.

As block ciphers go, AES is a relatively elegant design: its cryptographic strength comes ultimately from the pseudo-inverse function P : \mathbb{F}_2^8 \rightarrow \mathbb{F}_2^8 defined by identifying \mathbb{F}_2^8 with the finite field \mathbb{F}_{2^8} and then inverting non-zero elements. (The zero word is the unique fixed point of P.) In a typical round the input in \mathbb{F}_2^{128} is split into 16 bytes and the non-linear function P is applied to each byte; the bytes are then combined and mixed up by two \mathbb{F}_2-affine permutations, and these are followed by a final key addition in \mathbb{F}_2.

Stream ciphers seem to be even messier: I asked without great success on crypto.stackexchange.com for a simply defined but not-trivially-breakable stream cipher I could use in my course.

As a toy model for a difference attack in my course, I used the block cipher with plaintext and ciphertexts \mathbb{F}_2^8, key space \mathbb{F}_2^8 \times \mathbb{F}_2^8 and encryption functions defined by

e_{(k,k')}(x) = P(x+k) + k'.

Suppose you know a single plaintext/ciphertext pair (x,z). For each guessed first key k_\star there is a unique k'_\star such that e_{(k_\star,k'_\star)}(x) = z, namely

k'_\star = P(x+k_\star) + z.

Thus a single known plaintext/ciphertext pair does not reveal either component of the key. To motivate the attack in this post, let us consider what two pairs (x,z) and (x+\Delta, z+\Gamma) reveal. To save space, write x_\Delta for x + \Delta. Suppose that (k_\star, k'_\star) is a possible key:

\begin{aligned} x &\mapsto x + k_\star\hskip7.5pt  \mapsto P(x+k_\star) \hskip7pt \mapsto P(x+k_\star) + k'_\star \hskip8pt = z \\ x_\Delta &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta + k_\star) \mapsto P(x_\Delta+k_\star) +k'_\star = z + \Gamma.\end{aligned}

A little thought shows that another possible key is (k_\star + \Delta, k'_\star + \Gamma):

\begin{aligned} x &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta+k_\star) \mapsto P(x_\Delta+k_\star) + k'_\star + \Gamma = z \\ x_\Delta &\mapsto x + k_\star\hskip7.5pt  \mapsto P(x + k_\star)\hskip7pt \mapsto P(x+k_\star) +k'_\star +\Gamma\hskip8pt  = z + \Gamma.\end{aligned}

Observe that the differences of top and bottom are in both cases

\Delta \mapsto \Delta \mapsto \Gamma \mapsto \Gamma.

This illustrates the critical point that differences are invariant under key addition, but typically changed by non-linear functions such as P. By the lemma below, for any non-zero input difference \Delta \in \mathbb{F}_2^8, the output difference

P(x+k_\star) + P(x + k_\star + \Delta)

can be 127 of the 256 elements of \mathbb{F}_{2}^8. This is usually taken as a sign of the cryptographic strength of P. But here it means that one can (apart from in one case) determine the pair \{x+k_\star,x+k_\star+\Delta\} from \Delta and \Gamma. For the toy cipher, this pair is \{x+k, x+\Delta+k\}, and so we determine \{k, k+\Delta\}, leaving exactly two possible keys.

Differences through pseudo-inverse.

Lemma. Let \delta \in \mathbb{F}_{2^8} \backslash \{0\}. If \gamma \in \mathbb{F}_{2^8} \backslash \{0,1/\delta\} then the equation p(\alpha) + p(\alpha+\delta) = \gamma has exactly zero or two solutions.

Proof. If \alpha \in \{0,\delta\} then

p(\alpha)+p(\alpha+\delta) = p(0) + p(\delta) = 0 + 1/\delta = 1/\delta.

We may therefore assume that \alpha\not\in \{0,\delta\} and so p(\alpha) = \alpha^{-1} and p(\alpha+\delta) = (\alpha+\delta)^{-1}. Now \alpha^{-1} + (\alpha+\delta)^{-1} = \gamma if and only if \delta = \gamma \alpha (\alpha + \delta). This is a quadratic equation in \alpha not having a repeated root. \Box.

(In the exceptional case \gamma = 1/\delta, there are four solutions, namely 0, 1, and the two roots of (\alpha/\delta)^2 + (\alpha/\delta) + 1 lying in \delta \mathbb{F}_{2^2}^\times \subseteq \mathbb{F}_{2^8}. In fact, since (\alpha/\delta)^2 + (\alpha/\delta) + 1/\gamma\delta has 2 roots if and only if 1/\gamma\delta \in \{\beta^2 + \beta : \beta \in \mathbb{F}_2^8\}, which is a co-dimension 1 subspace of \mathbb{F}_{2^8}, there are exactly 127 output differences \gamma for each non-zero input difference \delta, each except 1/\delta arising from exactly two inputs \alpha.)

In my course I spent an entire lecture on the difference attack, simplifying the lemma by only considering \Delta = 0000\,0001, corresponding to \delta = 1 \in \mathbb{F}_{2^8}, and omitting the parenthetical paragraph entirely. Still I’m fairly sure no-one except the keenest M.Sc. students got the idea. Probably I will scrap it next year, and instead give an example from a different toy block cipher with a more easily defined S-box. Or maybe, since there is no chance of universal comprehension, I should just leave it to a problem sheet, with a hint that the lemma above follows from Hilbert’s Theorem 90. (Joke.)

Variations on pseudo-inverse

  • The exceptional case \gamma=1/\delta is annoying. It arises because X^2+X+1 has a root in \mathbb{F}_{2^8}. Over any \mathbb{F}_{2^c} with c odd, there are 2^{c-1} output differences for any input difference \gamma. For instance, over \mathbb{F}_{2^2} we only see the exceptional case (and even more embarrassingly, the pseudo-inverse function p is linear), while over \mathbb{F}_{2^3}, defined by \beta^3 + \beta = 1, the output differences for input difference 1 are \{1,1+\beta, 1+\beta^2, 1+\beta+\beta^2\}; these are the (no-longer-so) exceptional \gamma = 1, and the reciprocals of the non-zero elements in the subspace

    \{\alpha^2 + \alpha : \alpha \in \mathbb{F}_{2^3} \} = \langle \beta, \beta^2\rangle.

  • It is mathematically tempting to replace the tricky finite field \mathbb{F}_{2^8} with \mathbb{F}_p for p an odd prime. Taking the input difference \delta, now applied (and extracted) using arithmetic in \mathbb{F}_p, we have

    p(x+1) - p(x) = 1/(x+1) - 1/x = 1/x(x+1)

    for x\not\in \{-1,0\}. Quadratic equations are now more easily understood: we can just complete the square to see that exactly (p-1)/2 of the elements \mathbb{F}_p^\times are output differences; again the output difference 1 has double the probability of the others.

  • I hoped that \mathbb{Z}/2^n\mathbb{Z} might give a nice example, using the function q such that q(2x) = 1/(2x+1) -1 and q(2x+1) = 1/(2x+1). This function is weak respect to the input difference of 1, since, by definition, q(2x+1)-q(2x) = 1 for any x. Perhaps surprisingly, this makes 1 less suitable for the attack. The situation for even input differences is somewhat messy. Clearly this is a non-starter for my course.
  • It’s tempting to try s(\alpha) = \alpha^2. But because of the identity

    s(\alpha + \delta) = (\alpha + \delta)^2 = \alpha^2 + \delta^2 = s(\alpha) + \delta^2,

    an input difference of \delta transitions deterministically (i.e. the input \alpha is irrelevant) to an output difference of \delta^2. One could replace the one-time-pad key addition in my toy block cipher with s and run essentially the same attack, but as a direct replacement for pseudo-inverse, squaring has completely the wrong properties.

  • Given this, one might try r(\alpha) = \alpha^3, working in \mathbb{F}_{2^c}. We have

    (\alpha+1)^3 + \alpha^3 = \alpha^2 + \alpha + 1,

    so by the argument already seen, there is a affine codimension 1 subspace of output differences. Unfortunately cubing is only invertible if c is odd; this is the case when the subspace is properly affine and rules out \mathbb{F}_{2^8}.