\n",
"\n",
"# War ([What is it Good For?](https://www.youtube.com/watch?v=bX7V6FAoTLc))\n",
"\n",
"The [538 Riddler Classic for 28 August 2020](https://fivethirtyeight.com/features/can-you-cover-the-globe/) asks about the chance of winning the children's [card game **War**](https://en.wikipedia.org/wiki/War_%28card_game%29) in a **sweep**: a game where player **A** wins 26 turns in a row against player **B**. (In **War**, players are dealt 26 cards each and on each turn they flip the top card in their respective hands; higher rank card wins. Other rules are not relevant to this problem.)\n",
"\n",
"We'll analyze this problem and come up with a program to solve it. First let's get the imports out of the way:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"import collections\n",
"from itertools import permutations, combinations\n",
"from statistics import mean"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here are the four approaches I considered:\n",
"\n",
"# Approach 1: Simple Arithmetic?\n",
"\n",
"There is no strategy in **War**, so on every turn, both players have an equal chance of winning the turn. If player **A** wins each turn with probability 1/2, than the probability of sweeping would be $(1/2)^{26}$ or about 15 in a billion. But that's **wrong** because there is also the possibility of a **tie**. After **A** picks a card, there are 51 cards remaining, and 3 have the same rank as **A**'s card, so the possible outcomes for the first turn are: \n",
"\n",
" tie: 3/51\n",
" A wins: 24/51\n",
" B wins: 24/51\n",
" \n",
"If every subsequent turn were the same, the probability that **A** sweeps would be $(24/51)^{26}$ or about 3 in a billion. But that's not quite right because the outcome of each turn depends on the cards picked in the previous turns. So simple arithmetic isn't enough."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Approach 2: Brute Force Enumeration?\n",
"\n",
"Brute force enumeration means:\n",
"- Consider every possible permutation of the deck of cards.\n",
"- For each permutation, deal out the cards and see whether or not **A** sweeps.\n",
"- The probability that **A** sweeps is the number of sweeps divided by the number of permutations.\n",
"\n",
"Easy-peasy; here's the code:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"Probability = float # Type: Probability is a number between 0.0 and 1.0\n",
"Deck = list # Type: Deck is a list of card ranks\n",
"\n",
"def make_deck(ranks=13, suits=4) -> Deck: \n",
" \"\"\"A Deck is a list of ranks, each rank repeated `suits` times.\"\"\"\n",
" return [r + 1 for r in range(ranks) for _ in range(suits)] \n",
"\n",
"def sweep(deck) -> bool: \n",
" \"\"\"Upon dealing this deck, does player A win every turn?\"\"\"\n",
" return all(deck[i] > deck[i + 1] for i in range(0, len(deck), 2))\n",
"\n",
"def p_sweeps(decks) -> Probability:\n",
" \"\"\"The probability that A sweeps, averaged over the decks.\"\"\"\n",
" return mean(sweep(deck) for deck in decks)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(*Implementation details:* in Python `False` is equivalent to `0` and `True` is equivalent to `1`, so the `mean` of an iterable of `bool` values is the same as the proportion (or probability) of `True` values. Also, it just felt wrong to have a card with rank `0`, thus I add 1 so that ranks start at `1`.)\n",
"\n",
"Here are two 8-card decks:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6, 7, 8]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"make_deck(8, 1)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 1, 2, 2, 3, 3, 4, 4]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"make_deck(4, 2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And here are the probabilities of sweeping a game played with those decks:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.0625"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p_sweeps(permutations(make_deck(8, 1)))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.03571428571428571"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p_sweeps(permutations(make_deck(4, 2)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What about the real deck, with 52 cards? Unfortunately, there are 52! permutations (more than $10^{67}$), and even if we were clever about the duplicated ranks and the ordering of the 26 turns, and\n",
"even if we could process a billion deals a second, it would still take [millions of years](https://www.google.com/search?q=%2852%21+%2F+4%21%5E13+%2F+26%21%29+nanoseconds+in+years&oq=%2852%21+%2F+4%21%5E13+%2F+26%21%29+nanoseconds+in+years) to complete the brute force enumeration. And 538 Riddler wanted the answer by Monday.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Approach 3: Simulation?\n",
"\n",
"It takes too long to look at **all** the permutations, but we can **randomly sample** the permutations. We call that a **simulation**:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def sample(deck, N) -> Deck: \n",
" \"\"\"Yield N randomly shuffled deals of deck.\"\"\"\n",
" for _ in range(N):\n",
" random.shuffle(deck)\n",
" yield deck"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p_sweeps(sample(make_deck(), 10_000))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sampling 10,000 deals with the full 52-card deck we got zero sweeps. Approach 1 suggests the probability of a sweep is somewhere near 3 in a billion, which means we would need to sample trillions of deals to get a reliable estimate of the probability. That would require roughly a day of run time: much better than millions of years, but still not good enough."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Approach 4: Abstract Incremental Enumeration!\n",
"\n",
"\n",
"\n",
"As discussed in my [**How To Count Things**](How%20To%20Count%20Things.ipynb) notebook, in problems where brute force enumeration is not feasible it is often possible to consider every outcome if we use a representation that is:\n",
"- **Abstract**: What matters is not the exact rank of every card in the deck, but rather whether **A**'s next card is higher, lower, or the same as **B**'s next card. The representation should tell us enough to compute the odds of those three outcomes, and no more.\n",
"- **Incremental**: First we'll consider the possibilities for the two cards in the first turn, and only for the outcomes in which **A** wins will we then move on to consider possible cards for the next turn. For outcomes in which **A** loses or ties, there is no need to consider the permutations of the remaining cards.\n",
"\n",
"For example, if there are two cards remaining then there are 52 × 51 possible concrete decks, but only two abstract cases: \n",
"* The two cards are **the same** rank, in which case the probability of **A** winning is 0, because it will be a tie. It doesn't matter whether the deck is `[5, 5]` or `[9, 9]`, or any other pair of cards. \n",
"* The two cards are **different** ranks, in which case the probability of **A** winning is 1/2. It doesn't matter whether the deck is `[10, 8]` or `[2, 5]` or any other two distinct ranks.\n",
" \n",
"# Abstract Deck\n",
"\n",
"The representation I will use, which I call `AbDeck` for **abstract deck**, is as follows:\n",
"- An `AbDeck` is a tuple of counts of the number of cards of each rank, without specifying the rank. \n",
"- Ranks with counts of zero are dropped from the representation.\n",
"- The counts are put in sorted order.\n",
"- The tuple represents the deck without regard to the order of the cards (that is, not a specific permutation).\n",
"- The tuple tells you just what you need (no more, no less) to compute the probability of a sweep.\n",
"- The initial 52-card deck is 13 fours: `(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4)`\n",
"- The possible two-card decks are `(1, 1)` and `(2,)`.\n",
"- The possible four-card decks are `(1, 1, 1, 1)`, `(1, 1, 2)`, `(1, 3)`, `(2, 2)`, and `(4,)`.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"AbDeck = tuple # Type: An abstract deck\n",
"\n",
"def make_abdeck(ranks=13, suits=4) -> AbDeck: \n",
" \"\"\"Make an abstract deck.\"\"\"\n",
" return AbDeck([suits] * ranks)\n",
"\n",
"deck = make_abdeck(13, 4) # The initial 52-card abstract deck\n",
"deck"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
">***Note:*** I originally thought of *two* possible abstract representations for decks: `AbDeck` as described above, and an alternative in which: \n",
"- A deck was a tuple where `deck[i]` gives the number of different ranks that have exactly `i` cards remaining in the deck.\n",
"- The initial deck is represented as `(0, 0, 0, 0, 13)`: 13 ranks with 4-of-a-kind.\n",
"- The two possible two-card decks are `(0, 2)` (two ranks with 1 card each) and `(0, 0, 1)` (one rank with two cards).\n",
">\n",
">That representation was more compact, and ran faster; I used it in the first version of this notebook. But the very nice [implementation](https://laurentlessard.com/bookproofs/flawless-war/) shared by [Laurent Lessard](https://laurentlessard.com/) convinced me to switch to the `AbDeck` representation becausse it is simpler to code and understand.\n",
"\n",
"# Probability Distribution\n",
"\n",
"We want to keep track of the possible outcomes of each turn. We'll do that with the help of a class called `PDist` (for **probability distribution**), a mapping of `{deck: p, ...}` where `deck` is an abstract deck that is the outcome after some number of turns, each of which **A** has won, and `p` is the probability of that outcome."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"class PDist(collections.Counter): \n",
" \"\"\"A probability distribution of {abstract_deck: probability}.\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Abstract Incremental Enumeration Strategy\n",
"\n",
"The probability that **A** sweeps a game of **War** given an abstract deck, `p_sweep(deck)`, is computed as follows:\n",
"\n",
"- Start with `P` being a **probability distribution** of outcomes after 0 turns: `{deck: 1.0}`.\n",
"- **for** each turn of the game (26 turns for a 52-card deck):\n",
" - Update `P` to be the probability distribution that results from playing a turn, `P = play_turn(P)`:\n",
" - **for** each `deck` in `P`:\n",
" - **for** each way of removing two cards of rank `i` and rank `j` from `deck` to yield a resulting deck:\n",
" - Increment the probability of the resulting deck by (probability of `deck` × probability of `i` × probability of `j`).\n",
"- **return** the sum of the probabilities of the winning decks in `P` from the final turn\n",
"\n",
"Note that (for example) `combinations(range(3), 2)` yields `(1, 2)`, but not `(1, 1)` nor `(2, 1)`. That's just what we want: if the two selected cards are `i, j = (1, 1)` then the result will be a tie, which we don't want to put in the probability distribution of decks where **A** sweeps. And out of the two possibilities `i, j = (1, 2)` and `i, j = (2, 1)`, exactly one will result in a win for **A**, so we only want to include one of them."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def p_sweep(deck: AbDeck) -> Probability:\n",
" \"\"\"The probability that player A sweeps a game of War.\"\"\"\n",
" P = PDist({deck: 1.0}) # The initial probability distribution\n",
" for turn in range(sum(deck) // 2):\n",
" P = play_turn(P)\n",
" return sum(P.values())\n",
"\n",
"def play_turn(P) -> PDist:\n",
" \"\"\"Play one turn with all possible card choices for players A and B; return\n",
" the probability distribution of outcomes where A still might sweep.\"\"\"\n",
" P1 = PDist() # The probability distribution of {deck: prob} after this turn\n",
" for deck in P:\n",
" n = sum(deck) # number of cards in deck\n",
" for i, j in combinations(range(len(deck)), 2):\n",
" P1[remove(deck, i, j)] += P[deck] * deck[i] / n * deck[j] / (n - 1)\n",
" return P1\n",
"\n",
"def remove(deck, *indexes) -> AbDeck:\n",
" \"\"\"Remove cards at indexes from deck.\"\"\"\n",
" counts = [c - indexes.count(i) for i, c in enumerate(deck)]\n",
" return AbDeck(sorted(c for c in counts if c > 0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Answer!\n",
"\n",
"What's the probability that player **A** will win in a sweep?"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3.132436174322299e-09"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p_sweep(deck)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A little over 3 in a billion. \n",
"\n",
"(By the way, this computation took about 0.2 seconds, a big improvement over millions of years.) \n",
"\n",
"That's the answer to *my* question about the probability of **A** sweeping, but 538 Riddler actually posed the question somewhat differently: \"*How many games of War would you expect to play until you had a game that lasted just 26 turns with no wars, like Duane’s friend’s granddaughter?*\" That is, they want to know the inverse of the probability, and they are allowing for either **A** or **B** to sweep. So that would be:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"159620171.70491105"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"1 / (2 * p_sweep(deck))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" We would expect a sweep about once every 160 million games. I have to say, I'm begining to doubt Duane’s friend’s granddaughter's claim. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working through the algorithm\n",
"\n",
"Let's work through how `p_sweep` and `play_turn` work. We'll start with `P` being the initial probability distribution, and then play four turns, each time displaying how `P` is updated:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PDist({(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 1.0})"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P = PDist({deck: 1.0})\n",
"P"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PDist({(3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.47058823529411703})"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P = play_turn(P) # Turn 1\n",
"P"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That makes sense: the only result where **A** still has a chance to sweep is when the two players select different ranks, giving us two ranks with three-of-a-kind remaining (and 11 with four-of-a-kind remaining). The probability of **A** sweeping one turn is 24/51 = 0.47058823529411703."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PDist({(2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.001728691476590634,\n",
" (2, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.050708283313325254,\n",
" (3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.16902761104441777})"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P = play_turn(P) # Turn 2\n",
"P"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PDist({(1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 3.065055809557862e-06,\n",
" (1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.00040458736686163773,\n",
" (2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.010114684171540949,\n",
" (1, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.0017981660749406122,\n",
" (2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.00020229368343081884,\n",
" (2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4): 0.04855048402339655,\n",
" (3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4): 0.043155985798574735})"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P = play_turn(P) # Turn 3\n",
"P"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PDist({(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 1.4807032896414793e-09,\n",
" (1, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 5.212075579538007e-07,\n",
" (1, 1, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 3.648452905676605e-05,\n",
" (2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 5.863585026980257e-07,\n",
" (2, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 1.563622673861402e-05,\n",
" (1, 1, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4): 2.3454340107921024e-06,\n",
" (1, 2, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4): 0.00018763472086336856,\n",
" (1, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4): 0.0016887124877703135,\n",
" (2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4): 4.397688770235195e-05,\n",
" (2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4): 0.0023923426910079496,\n",
" (2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4): 0.014635508227342713,\n",
" (3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4): 3.127245347722804e-05,\n",
" (1, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4): 0.002001437022542595,\n",
" (2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4): 0.02101508873669728,\n",
" (3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4): 0.007005029578899085})"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"P = play_turn(P) # Turn 4\n",
"P"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll leave it as an exercise for the reader to verify that these turns are correct."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Gaining Confidence in the Answer\n",
"\n",
"One way to gain confidence: The answer should be somewhere close to the arithmetic calculation of $(24/51)^{26}$."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0167508045532485"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p_sweep(deck) / ((24/51) ** 26)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that it is indeed close, differing by less than 2%.\n",
"\n",
"Another way to gain confidence: We can verify that there is no difference (∆ = 0) between the brute force `p_sweeps` and the abstract incremental `p_sweep` for small decks. This gives us some confidence that both are correct (or maybe both have similar errors). `p_sweeps` takes less than 50 milliseconds to solve decks of 8 cards or less, but takes about 4 seconds to do the 10! (almost 4 million) permutations of ten card decks."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"deck(2, 1): p = 0.5000; ∆ = 0.000000000\n",
"deck(4, 1): p = 0.2500; ∆ = 0.000000000\n",
"deck(2, 2): p = 0.1667; ∆ = 0.000000000\n",
"deck(6, 1): p = 0.1250; ∆ = 0.000000000\n",
"deck(3, 2): p = 0.0667; ∆ = 0.000000000\n",
"deck(2, 3): p = 0.0500; ∆ = 0.000000000\n",
"deck(8, 1): p = 0.0625; ∆ = 0.000000000\n",
"deck(4, 2): p = 0.0357; ∆ = 0.000000000\n",
"deck(2, 4): p = 0.0143; ∆ = 0.000000000\n",
"deck(5, 2): p = 0.0180; ∆ = 0.000000000\n",
"deck(2, 5): p = 0.0040; ∆ = 0.000000000\n"
]
}
],
"source": [
"for r, s in [(2, 1), \n",
" (4, 1), (2, 2), \n",
" (6, 1), (3, 2), (2, 3),\n",
" (8, 1), (4, 2), (2, 4), \n",
" (5, 2), (2, 5)]:\n",
" p1 = p_sweeps(permutations(make_deck(r, s)))\n",
" p2 = p_sweep(make_abdeck(r, s))\n",
" print(f'deck({r}, {s}): p = {p1:.4f}; ∆ = {abs(p1 - p2):.9f}')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}