Meet-in-the-middle attacks on block ciphers
Wildon's Weblog 2019-08-20
A mathematical model for a block cipher is a collection of invertible functions
indexed by keys
. We say that the block cipher
has block size
and key length
. For example, DES has
and
, while its successor AES has
and
(or
or
for the super-paranoid).
Let and
be block ciphers of block size
and key lengths
and
, respectively. An easy way to boost the key length to
, while keeping essentially the same cryptologics, is to compose the encryption functions from the two ciphers. Thus for
we define
Perhaps surprisingly, when , this cryptoscheme offers only
bits of security, rather than the expected
. The purpose of this post is to give a fairly precise costing of the meet-in-the-middle attack behind this.
As a notational convention, denotes quantities unknown to the cryptanalyst and
quantities chosen or observed by the cryptanalyst. We cost by the number of encryption/decryption operations, or cryptops, justifying this assumption in the course of the argument.
Let be the (unknown) key. Let
be a chosen plaintext and let
be its observed encryption. (We write
rather than
to emphasise that in the chosen plaintext model, we can perform two-stage encryption using the unknown key pair
, but not each stage separately.) We denote by
the inverse of the encryption function
. Let
Thus ,
and
cryptops are required to get this far.
Using any reasonable algorithm, sorting by the second entry in each pair takes at most
primitive operations for some smallish
. Similarly
is sorted using
primitive operations. Working through the sorted lists we pair up the
with the
, for each
. In one possible implementation this makes a list of pointers to the relevant members of
and
for each
, so the cost in primitive operations is
for some small . Since each cryptop for
must produce
bits, and similarly for
, and typically
, it seems reasonable to assume that the cost of
cryptops is significantly more than the cost of this sorting and pairing up.
We now have the sets
for each . Observe that
is those key pairs (for the composed cryptosystem) that meet-in-the-middle at
. Fix
distinct plaintexts
For each we encrypt
using the guessed key pair
, and check if
for each . If all
encryptions agree, we assume that
. (But for simplicity, let’s say we carry on checking all the other keys.) The number of pairs to be checked in the final step is
and each pair requires two encryptions, making the cost of the final step cryptops. The total cost in cryptops is therefore
.
We now estimate for the random cipher, and use this as an approximation to DES and to AES.
Random cipher
Suppose that each and
is chosen uniformly and independently at random from the
permutations of
. We know that
and for each other key
, the probability is
that
. Dually, we know that
, and for each other key
, the probability is
that
. Therefore
and
are independently distributed as the shifted binomial distribution
Similarly, if then
and
are independently distributed as
. (It is worth noting that
and
are not independent: for example we have
where the union is disjoint, therefore
.) It follows by linearity of expectation that
This is very well approximated by . Hence the total cost is estimated as
It remains to choose a suitable value for . By our randomness assumption,
and
are each permutations of
chosen uniformly at random subject to the constraint
. Thus for each chosen plaintext
, the ciphertexts
and
are uniformly distributed on
. The probability they agree is therefore
. More generally, the probability that
for all of the
chosen
is
We have to check key pairs, so the expected number of ‘false’ keys is very nearly
; by the good approximation for
above, this is very nearly
Suppose for simplicity that . Setting
, our estimate for the total work is then
cryptops, where should be chosen so that
is large compared to
. In particular, if
is very large then, as one would expect, then one can take
: no checking cryptops are needed.
DES as a random cipher
For DES, ,
, so
,
and we need
large compared to
. Taking
, the total work is
. Clearly for any reasonable choice of
, the cost of the attack is dominated by the first phase, and is about
cryptops. Even in 2009, using a special purpose device, this took at most
days.
AES as a random cipher
For AES with the shortest key length, , so
and we need
large compared to
. Again taking
, the total work is
, roughly balanced between the first and second phases. For AES with the longest key length,
and
, so
and we need
large compared to
. Taking
, the total work is
. The second phase dominates. Of course neither attack is practical.
A cipher chosen to have many collisions in the middle
Let and define
by
. (This is multiplication in the finite field: we will ignore that the encryption functions
are not injective.) Define
by
. The set
of keys that meet-in-the-middle at
is
Again we ignore the vanishing small chance that . Hence
The estimate for the cost is
Using the heuristic for from the random cipher, we have
, so we need
large compared to
. Taking
the total cost is
. In this case the chance is
that the true key is found, but since the second components of each half of the encryption key are irrelevant, this will not worry the attacker. One reason for not using
is that the composed encryption functions for keys
are then the same whenever
, giving a further collapse. For the composed cipher as defined, since the affine group on
is sharply
-transitive, we could in fact have taken
.