Unspeakable bets: take small steps
Win-Vector Blog 2014-01-06
I was watching my cousins play Unspeakable Words over Christmas break and got interested in the end game. The game starts out as a spell a word from cards and then bet some points game, but in the end (when you are down to one marker) it becomes a pure betting game. In this article we analyze an idealized form of the pure betting end game.
A simplification (and deliberate alteration) of the end game of Unspeakable Words is as follows. You have two players “a” and “b”. The state of the world is how many points a has, how many points b has and whose move it is. The goal of the game is to take your own point count to zero on your move. There is also a 20 sided die, but for this writeup we will use a n=6 sided die. When it is your move you can bet any number up to your number of points (but no greater than the largest number on the die). You then roll the die. If the die is comes up larger than your bet you can remove that many points from your total and play continues with the other player. If the die is less than or equal to your bet you lose (“bust”) and the other player wins. We have translated the game into something that is deliberately “nim like” (ala Berlekamp, Conway and Guy “Winning Ways”) except for the retention of uncertainty (the die rolling).
The natural question is: what is a good betting strategy?
My first instinct is you might want to make big bets. The intuition would be that when you bet k you have a (1-k/n) chance taking k points off your total so you could think of a k point bet as having an expected value of (1-k/n)*k. Because the expected value of a k point bet is less than k points (due to the chance of busting) the situation reminds me of the being forced try and get a $1000 profit in an unfair (typical) casino. For the casino problem the usual solution is find the game that is closest to fair and bet all in as large increments as allowed. The idea is to minimize the number of times you play (since each play is disadvantageous) so as to maximize your chance of having an atypical or profitable run.
It turns out the large bet intuition is in fact wrong. For Unspeakable Words, late in the game you want to make small bets as you need to value avoiding ruin (losing your last marker or busting) more than finishing your point goal. The analogy that actually holds is derived from the following.
For the moment let’s ignore the other player. Let’s work on the simpler game with a single player who either gets rid of all of their points (to win) or busts (and loses). For this game the state is just the number of points and we just need to know what size bet to make in each state. By induction we can show the optimal bet is always 1 point (until you have won or lost). So there is an argument for small bets.
We will use induction to show small (1 point) bets are optimal.
When a=1 (you have 1 point to get rid of) the only allowed bet is 1, so we have established a base case.
Now suppose we know for states a=1…k that the optimal bet is 1 point. We wish to show this is also the case for a=k+1. Now a bet of d-points would move us from state k+1 to state k+1-d with probability 1-d/n and immediately lose with probability d/n. So the expected value of a d-point bet from state k+1 (assuming we are encoding winning as 1 and losing as 0) is: (1-d/n)*value(k+1-d) where value(k+1-d) is the expected of winning from state k+1-d under optimal play. We know d>=1 so k+1-d <= k and we can apply our induction hypothesis that says optimal play from that point on is 1-point bets. So the value(a) for any a<k+1 is (1-1/n)^(a-1) (the odds of surviving a-1 1 point bets in a row). So the value of a d-point bet from state k+1 is exactly (1-d/n) (1-1/n)^(k+1-d). If d>1 we also know the value of a d-1 point bet from state k+1 is exactly (1-(d-1)/n) (1-1/n)^(k+1-d+1). We can show the d-1 point bet value is larger than the value of a d-point bet from state k+1, meaning if d>1 we should always make a smaller bet.
So once we confirm (1-d/n) (1-1/n)^(k+1-d) < (1-(d-1)/n) (1-1/n)^(k+1-d+1) we have proven that we should always bet one point (which was our goal). A little cancellation shows this is the same as checking if 1-d/n < (1-(d-1)/n)(1-1/n) which in turn is equivalent to checking if 1-d/n < 1 – (d-1)/n – 1/n + (d-1)/n^2, which finally is equivalent to the true statement 0 < (d-1)/n^2.
So if you are not in a rush (when there is no second player trying to win) you want to make small bets. Another observation is the optimal value of a state value(a) = (1-1/n)^(a-1) is a form that is familiar to theorists. We are going to have value(a) is approximately e^((a-1)/n). Or in particular value(n) is near e^(-1) which is 0.367879 and less than 1/2 (but greater than the odds of winning it all in one large bet). So you don’t want to play this game for large a, you are more likely to bust than to get rid of your points. This conclusion is even stronger if you play the variation of the game where you bust only if the die rolls less than your bet (because then bets of size 1 never lose, so unless you are rushed it is very obvious there is no reason to not make 1 point bets).
Let’s get back to the 2-player game. You are going to be forced to make bigger bets if your opponent has fewer remaining points. For n=6 the following table is the optimal player-a turn bets given the row is a’s state and the column is b’s state:
aMove b:0 b:1 b:2 b:3 b:4 b:5 b:6 a:0 0 0 0 0 0 0 0 a:1 1 1 1 1 1 1 1 a:2 1 2 2 2 2 1 1 a:3 1 3 3 3 1 1 1 a:4 1 4 4 1 1 1 1 a:5 1 5 1 1 1 1 1 a:6 1 1 1 1 1 1 1
Notice if player b is ahead player a tends for all-in bets (betting all of their remaining points at once). If player a is ahead they tend to (with a few exceptions) settle back to small 1-point bets. Also look at the diagonal a=b. For small states (a <= 3) the a-player still makes large bets, but for larger values they go back to conservative strategies. The value for player-a starting in each state (the odds of a winning) are given in the table below.
aValue b:0 b:1 b:2 b:3 b:4 b:5 b:6 a:0 1.000 1.000 1.000 1.000 1.000 1.000 1.000a:1 0.000 0.833 0.833 0.833 0.833 0.833 0.833a:2 0.000 0.667 0.667 0.667 0.667 0.694 0.718a:3 0.000 0.500 0.500 0.500 0.556 0.602 0.621a:4 0.000 0.333 0.333 0.417 0.486 0.525 0.557a:5 0.000 0.167 0.278 0.370 0.428 0.476 0.503a:6 0.000 0.139 0.255 0.332 0.396 0.436 0.470
Of particular interest are the entries on the diagonal that are less than 1/2. These are states where even though the state is symmetric (a=b) moving first is a disadvantage (you would rather pass or make a bet of 0 points if that were allowed). This is an effect of the fact that you are more likely to bust than get rid of all of your points for large states. So you would prefer to win not by getting rid of your points in these cases but by waiting for your opponent to bust. This is not the case if you are using the rule that you bust only when the die is smaller than your bet (in this case you do always want to move first).
Notice we can really only make sense of these strategy tables if we have intuition from the even simpler one player game available to guide us. Dynamic programming arguments always seem pretty ad-hoc, but the method is usually so powerful that with some tinkering you will get results and insights. These kind of toy problems are good introductions to dynamic programming, game theory and planning under uncertainty (all of which have fantastic real world applications and are a welcome break from machine learning and data science concerns).
Python 2.7.x code to solve for optimal player a and player b betting strategies. You can experiment with different games sizes (by altering n
() and different loss conditions (by changing bustFn
to lambda m: min(1.0,(m-1)/float(n))
).
# Dynamic programming solution to the following simple game (inspired# by the end-game of "Unspeakable Words", 2007). Game alternates# between player a and player b making moves (starts with player a).# Game state is the pair of non-negative integers (a,b) representing# the a-count and b-count (a,b <= n). Winner the is first player to# take their own count down to zero. When it is the players move they# "bet" an integer quantity "move" from 1 to their current count.# With probability move/n the player "busts" and loses, with# probability 1-move/n the player lowers their count by move points.import numpyimport mathn = 6 # maximum starting count, passed to all functionsbustFn = lambda m: min(1.0,m/float(n)) # probability of losing a m-point bet m>=1# caches for best move and position value.# a*[a,b] represents the state when it is a's move with# a-counts remaining for a and b-counts remaining for b.# b*[a,b] represent similar state when it is b's move.# Player a winning is encoded as 1, player b winning is# encoded as 0. Move tables contain the optimal# move, value tables contain the optimal probability# of a winning (a wants this high, b wants this low).aMove = numpy.zeros((n+1,n+1)) + float('nan')aValue = numpy.zeros((n+1,n+1)) + float('nan')bMove = numpy.zeros((n+1,n+1)) + float('nan')bValue = numpy.zeros((n+1,n+1)) + float('nan')# compute the value and best move for player# a to move in state (a,b).# caching converts recursion to a dynamic program.def aVal(a,b): if not math.isnan(aValue[a,b]): return aValue[a,b] bestMove = 0 besbValue = float('nan') if a<=0: besbValue = 1 else: for move in range(1,a+1): pBust = bustFn(move) # from player a's point of view a-busting # has value 0 and not busting has value of # the reached smaller state on b's move. # a is picking moves to maximize. value = (1-pBust)*bVal(a-move,b) if math.isnan(besbValue) or value>besbValue: bestMove = move besbValue = value aMove[a,b] = bestMove aValue[a,b] = besbValue return besbValue# compute the value and best move for player# b to move in state (a,b).# caching converts recursion to a dynamic program.def bVal(a,b): if not math.isnan(bValue[a,b]): return bValue[a,b] bestMove = 0 besbValue = float('nan') if b<=0: besbValue = 0 else: for move in range(1,b+1): pBust = bustFn(move) # from player a's point of view b-busting # has value b and not busting has value of # the reached smaller state on a's move. # b is picking moves to minimize. value = pBust + (1-pBust)*aVal(a,b-move) if math.isnan(besbValue) or value<besbValue: bestMove = move besbValue = value bMove[a,b] = bestMove bValue[a,b] = besbValue return besbValue# force calculation of all entries of tablesfor i in range(0,n+1): aVal(i,n) aVal(n,i) bVal(i,n) bVal(n,i)# fixed format print matrix m with string format sf and number format nfdef printMat(m,sf,nf): print sf.format(''),' '.join([sf.format('b:'+ str(j)) for j in range(0,n+1)]) for i in range(0,n+1): ui = m[i] print sf.format('a:'+str(i)),' '.join([sf.format(nf.format(uij)) for uij in ui])print 'aValue'printMat(aValue,'{0: <5}','{0:.3f}')print 'aMove'printMat(aMove,'{0: <5}','{0:.0g}')