Mastermind – five questions do suffice

Sebastian Pokutta's Blog 2018-03-12

Today I would like to talk about the Mastermind game and related (recreational?!) math problems – the references that I provide in the following are probably not complete. Most of you might know this game from the 70s and 80s. The first player is making up a secret sequence of colored pebbles (of a total of 6 colors) and the other player has to figure out the sequence by asking questions about the code by proposing potential solutions. The first player then indicates the number of color matches.

Mastermind (source: Wikipedia)

More precisely, Wikipedia says:

The codebreaker tries to guess the pattern, in both order and color, within twelve (or ten, or eight) turns. Each guess is made by placing a row of code pegs on the decoding board. Once placed, the codemaker provides feedback by placing from zero to four key pegs in the small holes of the row with the guess. A colored (often black) key peg is placed for each code peg from the guess which is correct in both color and position. A white peg indicates the existence of a correct color peg placed in the wrong position.

If there are duplicate colours in the guess, they cannot all be awarded a key peg unless they correspond to the same number of duplicate colours in the hidden code. For example, if the hidden code is white-white-black-black and the player guesses white-white-white-black, the codemaker will award two colored pegs for the two correct whites, nothing for the third white as there is not a third white in the code, and a colored peg for the black. No indication is given of the fact that the code also includes a second black.

Once feedback is provided, another guess is made; guesses and feedback continue to alternate until either the codebreaker guesses correctly, or twelve (or ten, or eight) incorrect guesses are made.

In a slightly more formal way, we have a string in $\{1,...,6\}^4$ and the “decoder” wants to reconstruct this string by inferring from the provided feedback. One of the natural questions that arise is of course how many questions do suffice. Knuth [Knuth76] then showed that five questions suffice to be able to always reconstruct the secret string. What is interesting about the proof is that it is a “table” – basically output of a computer program. This lookup table can be used so find a next question at any given point. The table is a greedy optimization in some sense: “Figure 1 [the lookup table] was found by choosing at every stage a test pattern that minimizes the maximum number of remaining possibilities, over all 15 responses by the codemaker”.

Later in 1983, Vasicek Chvátal dedicated a paper on the Mastermind game to Paul Erdős for his 70th birthday. Chvátal looked at generalized admissible Mastermind vectors denoted by $V(n,k)$ of vectors of length n with k different colors. It is not too hard to see that the minimum number of questions $f(n,k)$ needed to correctly identify any string in $V(n,k)$ is bounded from below by

$f(n,k) \geq \frac{n \log k}{\log \binom{n+2}{2}}$

which arises from the fact that there are only $\binom{n+2}{2}$ different answers and $n^k$ different strings have to be distinguished. Complementing this bound, Chvátal showed that the number of questions needed to be asked without waiting for the answer (i.e., the questions are asked in one go, then the answers to all questions are provided at once, and then the code has to be uniquely identified) can be bounded from above as follows: the number of questions needed for this static case will be denoted by $g(n,k)$ and for any $\epsilon > 0$ there exists $n(\epsilon)$ so that for all $n > n(\epsilon)$ and $k < n^{1-\epsilon}$ we have

$g(n,k) \leq (2+ \epsilon) n \frac{1+2 \log k}{\log n - \log k}$

and clearly we have $f(n,k) \leq g(n,k)$ . The proof uses the probabilistic method in a nice way. Moreover, Chvátal also provides some upper and lower bounds for special cases. Those of you guys that know about my addiction to the Chvátal-Gomory closure and its friends might have already guessed that this is exactly how I came across the problem…

The latter problem where we do not wait for the answers is usually called the static mastermind problem whereas the classical version is called the dynamic mastermind problem. Later in 2003 and 2004 Goddard (see [Godd03,04]) provided optimal values for the minimal number of questions to be asked both in the dynamic as well as static case and also for the average number (denoted by $r(n,k)$ ) of questions needed whenever the secret string is uniformly picked at random. With the notation from above we have the following number of questions (tables taken from [Godd03,04]):

For the average number of queries needed ( $r(n,k)$ ) we obtain:

Positions234567Colors2 – 22.2502.7503.0313.5003.8753 – 2.3332.7043.0373.3584 – 2.8133.2193.5355 – 3.2403.6083.9416 – 3.6673.9544.3407 – 4.0414.2978 – 4.4389 – 4.79010- 5.170

and similarly for the dynamic case we have the following minimum number of queries $f(n,k)$ :

Positions2345678Colors2 – 33445563 – 44445<= 64 – 4445<= 65 – 555<= 66 – 5557 – 66<= 68 – 69 – 710- 7

and for the static case $g(n,k)$ we have the following table. Note that in the table below the final “query” that states the recovered string is not counted as in comparison to the ones above. Therefore in order to compare the values with the ones above you need to add “1” to each entry.

Positions2345678Colors2 – 22334553 – 23344<= 54 – 34455 – 4456 – 4567 – 56<= 78 – 67<= 89 – 6810– 79

(there seems to be a typo for n = 2 and k = 3 in one of the tables, as the static case has a better performance than the dynamic case which is not possible).

In order to be able to actually check (with a computer) whether a certain number of questions suffices, we have to exclude symmetries in a smart way. Otherwise the space of potential candidates is too large. In this context, in particular the orderly generation framework of [McKay98] is very powerful. The idea behind that framework is to incrementally extend the considered structures in such a way that we only add a canonical candidate per orbit. Moreover, after having extended our structure to the next “size” we need to check whether it is isomorphic to one of the previously explored structures. In this case we do not consider it. For each candidate we check whether the number of distinct answers is equal to the total number of possible secret codes. In this case there is a bijection between the two and therefore we can decode the code. However it is not clear that this bijection needs to have a “nice” structure or that it is “compact” in some sense.

References:

[Knuth76]: Knuth, D.E. 1976. “The computer as a master mind.” Journal of Recreational Mathematics. http://colorcode.laebisch.com/links/Donald.E.Knuth.pdf (Accessed June 9, 2011).
[Chvátal83]: Chvátal, V. 1983. “Mastermind.” Combinatorica 3: 325-329.
[McKay98]: McKay, B.D. 1998. “Isomorph-free exhaustive generation.” Journal of Algorithms 26(2): 306–324.
[Good03]: Goddard, W. 2003. “Static Mastermind.” Journal of Combinatorial Mathematics and Combinatorial Computing 47: 225-236
[Godd04]: Goddard, W. 2004. “Mastermind Revisited.” Journal of Combinatorial Mathematics and Combinatorial Computing 51: 215-220