{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "<div style=\"text-align: right\" align=\"right\"><i>Peter Norvig<br>2014, 2023</i></div>\n",
    "\n",
    "# Sicherman Dice\n",
    "\n",
    "> *Note: This notebook takes the form of a conversation between two friends, Ali and Bo, who like to solve problems.*\n",
    "\n",
    "Ali: Huh. [**This**](http://wordplay.blogs.nytimes.com/2014/06/16/dice-3) is interesting.  You know how, in a board game like Monopoly, you roll two dice, add them up, and then move that number of spaces? \n",
    "\n",
    "**Bo: Right.**\n",
    "\n",
    "Ali: And a sum like 2 can only be made one way, 1 + 1, but 9 can be made multiple ways. \n",
    "\n",
    "**Bo: Yeah. 9 can be made 4 ways: 3 + 6, 4 + 5, 5 + 4, or 6 + 3.**\n",
    "\n",
    "Ali: The interesting thing is that people have been playing dice games for 7,000 years. But it wasn't until 1977 that <a href=\"http://userpages.monmouth.com/~colonel/\">Colonel George Sicherman</a> asked whether is is possible to have a pair of dice that are not \"standard\" dice (standard dice have [1, 2, 3, 4, 5, 6] on their six sides) but have the same **distribution of sums** as a standard pair–so the pair of Sicherman dice would also have one way of making a 2 and four ways of making 9, but it could be different ways; maybe 8+1 could be one way. Like standard dice, each of the sides on a Sicherman die is a positive integer, but some of them could be greater than 6.\n",
    "\n",
    "**Bo: And what did Sicherman find?**\n",
    "\n",
    "Ali: Wouldn't it be more fun to figure it out for ourselves?\n",
    "\n",
    "**Bo: OK!**\n",
    "\n",
    "Ali: How could we proceed?\n",
    "\n",
    "**Bo: When in doubt, [use brute force](http://quotes.lifehack.org/quote/ken-thompson/when-in-doubt-use-brute-force/): we can write a program to enumerate and check the possibilities:**\n",
    "\n",
    "\n",
    "- **Generate all dice that could possibly be part of a solution, such as [1, 2, 2, 4, 8, 9].**\n",
    "- **Consider all pairs of these dice, such as ([1, 2, 2, 4, 8, 9], [1, 3, 4, 4, 5, 8])**\n",
    "- **Check each pair to see if it has the same distribution of sums as the standard pair (but does not use standard dice).**\n",
    "\n",
    "Ali: How should we represent the distribution of sums?\n",
    "\n",
    "**Bo: A pair of 6-sided dice makes 36 different sums; we want a representation that will be the same for two different pairs of dice if all 36 sums are the same, but where the order of the sums doesn't matter. We could just have a sorted list of the sums, but it would be more compact to have a `Counter` of the sums.**\n",
    "\n",
    "Ali: Good. First I'll write down some data types:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "from typing import *\n",
    "\n",
    "Side = int     # Type for a side of a die: a positive integer, e.g. 3\n",
    "Die  = list    # Type for a die: a list of numbers on sides, e.g. [1, 2, 2, 4, 8, 9]\n",
    "Pair = tuple   # Type for a pair of dice, e.g. ([1, 3, 4, 4, 5, 8], [1, 2, 2, 4, 8, 9])\n",
    "Dist = Counter # Type for the distribution of sums of a pair of dice, e.g. {2: 1, 3: 2, ... 12: 1}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Now I can code up your description almost verbatim. I'll also keep track of our TO DO list:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "def sicherman() -> List[Pair]:\n",
    "    \"\"\"A list of nonstandard pairs of dice that have the same\n",
    "    distribution of sums as a standard pair of dice.\"\"\"\n",
    "    return [pair for pair in all_pairs(all_dice())\n",
    "            if distribution_of_sums(pair) == standard_dist\n",
    "            and pair != standard_pair]\n",
    "\n",
    "def all_pairs(dice) -> List[Pair]: \n",
    "    \"\"\"Return all dice pairs (A, B)  where A <= B (so, not also (B, A)).\"\"\"\n",
    "    return [(A, B) for A in dice for B in dice if A <= B]\n",
    "\n",
    "# TODO: all_dice(), distribution_of_sums(pair), standard_dist, standard_pair"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: Looks good to me. We should test `all_pairs` to make sure it works:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_pairs(['a', 'b', 'c'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "# Standard Dice and Distributions\n",
    "\n",
    "Ali: Now we need `distribution_of_sums(pair)`, and definitions for standard die, pair, and distribution. That should be easy. Dice start at one, not zero, so I can't use `range(6)` for the possible sides of a die; instead I decided to define `ints` and use `ints(1, 6)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "def distribution_of_sums(pair) -> Dist:\n",
    "    \"All possible sums of a side from one die plus a side from the other.\"\n",
    "    return Counter(a + b for a in pair[0] for b in pair[1])\n",
    "\n",
    "def ints(start, end) -> range: \n",
    "    \"The integers from start to end, inclusive.\"\n",
    "    return range(start, end + 1)\n",
    "\n",
    "standard_die  = Die(ints(1, 6))\n",
    "standard_pair = (standard_die, standard_die)\n",
    "standard_dist = distribution_of_sums(standard_pair)\n",
    "\n",
    "# TODO: all_dice()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: Let's look at what we got:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "standard_die"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "standard_pair"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1})"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "standard_dist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: Looks good! Now only one more thing on our TO DO list:\n",
    "\n",
    "# All Possible Dice\n",
    "\n",
    "\n",
    "Ali: `all_dice()` should generate all possible dice, where by \"possible\" I mean the dice that could feasibly be part of a pair that is a solution to the Sicherman problem. Do we know how many dice that will be? Is it a big enough number that efficiency will be a problem?\n",
    "\n",
    "**Bo: Let's see.  A die has six sides. If each side can be a number from, say, 1 to 10, that's 10<sup>6</sup> or a million possible dice; a million is a small number for a computer.**\n",
    "\n",
    "Ali: True, a million is a small number. But how many pairs of dice will there be?\n",
    "\n",
    "**Bo: Ah. That's right. A million squared is a trillion.  That's a big number even for a computer. An empty loop that just counts to a million takes milliseconds in Python, but counting to a *trillion* takes hours. Checking a trillion pairs will take days.**\n",
    "\n",
    "Ali: So we really need to pare down the number of possible dice. What about permutations?\n",
    "\n",
    "**Bo: Good point!  If I have the die [1, 2, 3, 4, 5, 7], then I don't need the different permutations–that is, dice like [2, 4, 7, 1, 3, 5]. We can arrange to only generate dice whose sides are in sorted order.**\n",
    "\n",
    "Ali: Any other constraints on the sides?\n",
    "\n",
    "**Bo: Here's something: the distribution of sums for a standard pair has exactly one 2. The only way to get a sum of 2 from two positive integers is 1+1. So that means that each die must have exactly one 1; no more, no less.**\n",
    "\n",
    "Ali: You said the sides can range from 1 to 10.  Are you *sure* that 11 can't be part of a legal solution? \n",
    "\n",
    "\n",
    "**Bo: I was just guessing, but let's check. The distribution of sums for a pair has exactly one 12, and nothing higher. Assume one die had an 11 on some side. We know the other die can only have one 1, so it must have some sides that are greater than 1, so the pair would make some sums that are greater than 12. That's wrong, so that means the assumption must be wrong: it can't be true that a  die has an 11.**\n",
    "\n",
    "Ali: So all together the smallest number must be 1, the next smallest is at least 2, and all the numbers are  10 or less.  So:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "def all_dice() -> List[Die]:\n",
    "    \"\"\"A list of all feasible 6-sided dice for the Sicherman problem.\"\"\"\n",
    "    return [[1, b, c, d, e, f]\n",
    "            for b in ints(2, 10) \n",
    "            for c in ints(b, 10)\n",
    "            for d in ints(c, 10)\n",
    "            for e in ints(d, 10) \n",
    "            for f in ints(e, 10)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Let's see how many dice there are:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1287"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(all_dice())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Nice! We got down from a million to 1287 different dice, which means the number of pairs is reduced from a trillion to just over a million! I don't want to look at all 1287 dice, but here's every hundredth die:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2, 2, 2, 2, 2],\n",
       " [1, 2, 2, 4, 7, 8],\n",
       " [1, 2, 3, 3, 10, 10],\n",
       " [1, 2, 4, 4, 6, 8],\n",
       " [1, 2, 5, 6, 8, 9],\n",
       " [1, 3, 3, 3, 3, 8],\n",
       " [1, 3, 3, 7, 8, 9],\n",
       " [1, 3, 5, 5, 5, 6],\n",
       " [1, 3, 7, 8, 8, 8],\n",
       " [1, 4, 4, 8, 8, 9],\n",
       " [1, 4, 7, 7, 7, 7],\n",
       " [1, 5, 6, 6, 8, 8],\n",
       " [1, 6, 7, 7, 8, 8]]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_dice()[::100]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: I think we're ready to run `sicherman()`.  Any bets on what we'll find out?\n",
    "\n",
    "**Bo: I bet that Sicherman is remembered because he discovered a pair of dice that works. If he just proved the non-existence of a pair, I don't think there would be an article about him.**\n",
    "\n",
    "Ali: Makes sense.  Here goes:\n",
    "\n",
    "# The Answer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sicherman()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: Look at that!**\n",
    "\n",
    "Ali: It turns out you can <a href=\"https://www.mathartfun.com/dSpecial.html\">buy</a> a pair of dice with just these numbers, and there are some nice [charts and tables on Wikipedia](https://en.wikipedia.org/wiki/Sicherman_dice) explaining it all.  \n",
    "\n",
    "![Sicherman dice from MathArtFun.com](http://www.mathartfun.com/shopsite_sc/store/html/SichermanDots.jpg)\n",
    "\n",
    "\n",
    "\n",
    "**Bo: We could stop here. Or ... what if we considered dice that don't have 6 sides?**\n",
    "\n",
    "# *N*-Sided Dice\n",
    "\n",
    "![A set of isohedral dice](https://digitalambler.files.wordpress.com/2013/05/dice.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: I know 4-, 8–, 12-, and 20-sided dice are common. Let's try to handle *N*-sided dice for as large an *N* as we can.. \n",
    "\n",
    "**Bo: Well, the number of dice pairs is going to be *O*(*N*<sup>2*N*</sup>), because when you add an *N*-th side to a die it can have (almost) *N* possible values, and then the number of pairs is proportional to the square of the number of dice. So my guess is we won't get up to 20 before our program becomes too slow. But we can go about parameterizing `sicherman` and the other pieces for now, and worry about efficiency later:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "def sicherman(N=6) -> List[Pair]:\n",
    "    \"\"\"A list of pairs of N-sided dice that have the same\n",
    "    distribution of sums as a standard pair of N-sided dice.\"\"\"\n",
    "    standard_pair = nth_standard_pair(N)\n",
    "    standard_dist = distribution_of_sums(standard_pair)\n",
    "    return [pair for pair in all_pairs(all_dice(N))\n",
    "            if distribution_of_sums(pair) == standard_dist\n",
    "            and pair != standard_pair]\n",
    "\n",
    "def nth_standard_pair(N) -> Die: return (Die(ints(1, N)), Die(ints(1, N)))\n",
    "\n",
    "# TODO: all_dice(N)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali:  Now, for `all_dice(N)`, I think most of the analysis from `N=6` still holds: \n",
    "- The first side must be 1.\n",
    "- Subsequent sides must be at least 2.\n",
    "- No side can be more than `2 * N - 2`\n",
    "- To avoid permutations, the sides must be in sorted order.\n",
    "\n",
    "For the old version of `all_dice()`, we had five nested loops for the five sides after the first. But the number of nested loops in a program is detemined at compile time; I can't write a program that has `N - 1` nested loops. We'll have to find some other implementation technique.\n",
    "\n",
    "**Bo: Here's an iterative approach: we keep track of a list of partially-formed dice, and on each iteration, we add a side to all the partially-formed dice in all possible ways, until the dice all have `N` sides. Our list starts with just one partially-formed die, with a single side containing a 1:**\n",
    "\n",
    "    dice = [ [1] ]\n",
    "    \n",
    "**Suppose `N = 4`. Then the biggest number on any side is 6, and we get these partially-formed dice with 2 sides:**\n",
    "\n",
    "    dice = [[1, 2], [1, 3], [1, 4], [1, 5], [1, 6]\n",
    "   \n",
    "**To extend to 3 sides we would add a number to each of those five in all possible ways, giving:**\n",
    "\n",
    "    dice = [[1, 2, 2], [1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 2, 6],\n",
    "            [1, 3, 3], [1, 3, 4], [1, 3, 5], [1, 3, 6], \n",
    "            [1, 4, 4], [1, 4, 5], [1, 4, 6],\n",
    "            [1, 5, 5], [1, 5, 6],\n",
    "            [1, 6, 6]]\n",
    "            \n",
    "**The final iteration would add one more side to each of these. Here's how the code looks:**\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "def all_dice(N=6) -> List[Die]:\n",
    "    \"\"\"All feasible Sicherman dice with N sides.\"\"\"\n",
    "    dice = [ [1] ] # The first side must be 1\n",
    "    for _ in range(N - 1): # Add N - 1 sides, each nondecreasing, between 2 and 2N-2\n",
    "        dice = [die + [side] \n",
    "                for die in dice \n",
    "                for side in ints(max(2, die[-1]), 2 * N - 2)]\n",
    "    return dice\n",
    "\n",
    "assert len(all_dice(6)) == 1287 # Make sure this hasn't changed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Let's try `sicherman(N)` for some small values of `N`.\n",
    "\n",
    "# Results  up to *N* = 6"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 4.1 s, sys: 34.2 ms, total: 4.13 s\n",
      "Wall time: 4.16 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{2: [],\n",
       " 3: [],\n",
       " 4: [([1, 2, 2, 3], [1, 3, 3, 5])],\n",
       " 5: [],\n",
       " 6: [([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])]}"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time {N: sicherman(N) for N in ints(2, 6)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: It is certainly reassuring that we get the same result for `sicherman(6)` as with the old version of `sicherman()`. And we learned something new:  there is a result for `sicherman(4)` but not for 2, 3, and 5. That's interesting!** \n",
    "\n",
    "Ali: Maybe there can be no solution for primes? How much farther can we go beyond `N = 6`?\n",
    "\n",
    "**Bo: Let's count how many possible dice there are for various values of `N`:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: 1,\n",
       " 2: 1,\n",
       " 3: 6,\n",
       " 4: 35,\n",
       " 5: 210,\n",
       " 6: 1287,\n",
       " 7: 8008,\n",
       " 8: 50388,\n",
       " 9: 319770,\n",
       " 10: 2042975}"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{N: len(all_dice(N)) for N in ints(1, 10)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: That doesn't look encouraging. With run time proportional to the square of the number of dice, it looks like the run time will be roughly on the order of a minute for `N=7`, an hour for `N=8`, days for `N=9`, and weeks for `N=10`. \n",
    "\n",
    "# Reducing the Number of Dice\n",
    "\n",
    "Ali: We  need to pare down the number of possible dice! Can we tighten up the constraints on sides? \n",
    "\n",
    "I think it would be helpful to me to have a look at a table of distributions of sums:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{2: {2: 1, 3: 2, 4: 1},\n",
       " 3: {2: 1, 3: 2, 4: 3, 5: 2, 6: 1},\n",
       " 4: {2: 1, 3: 2, 4: 3, 5: 4, 6: 3, 7: 2, 8: 1},\n",
       " 5: {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 4, 8: 3, 9: 2, 10: 1},\n",
       " 6: {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}}"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{N: dict(distribution_of_sums(nth_standard_pair(N))) for N in ints(2, 6)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: You're right! That is helpful! I'm reminded of the very first die generated by the first version of `all_dice()`:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 2, 2, 2, 2]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_dice()[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: I had a feeling that was too many twos, and could never be part of a solution. I can see now that every `nth_standard_dist` distribution has at most one 2, two 3s, three 4s, and so on. How does the distribution relate to the dice? We know every die has a 1. So a die with five 2s, when paired with any other die, would make six 3s. That's too many; we can only have two 3s in a distribution. That means that a die can never have more two 2s, three 3s, four 4s, and so on. It all works because we know the other die has to have a 1.**\n",
    "\n",
    "**I'm going to define a function `lower_bounds(N)` which will produce a list of integers that are the lower bounds for each side of an `N`-sided die. Previously, our lower bounds for a 6-sided die were implicitly specified as [1, 2, 2, 2, 2, 2], but we now know that's not right: we don't want to allow a die with five 2s.**\n",
    "\n",
    "Ali: Before you do that, I thought of something else: the first side is special in that there can only be one copy of it; otherwise we'd get two 2s, and we only want one 2. The last side should be similarly special. We don't know what the last value will be, but we do know that we only want one of them, because if there were more than one, then we'd get more than one of the biggest sum in the distribution of sums, and we know we only want one.\n",
    "\n",
    "**Bo: Very good. That means the lower bound of the last side must be one more than the lower bound of the penultimate side.  Here we go:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "def lower_bounds(N):\n",
    "    \"A list of lower bounds for respective sides of an N-sided die.\"\n",
    "    bounds = [1]                 # First side is 1\n",
    "    while len(bounds) < N:\n",
    "        s = bounds[-1] + 1\n",
    "        bounds.extend(s * [s])   # Allow two 2s, three 3s, etc.\n",
    "    bounds = bounds[:N]          # If we added too many s's, drop the extras\n",
    "    bounds[-1] = bounds[-2] + 1  # Last side is one more than second-to-last\n",
    "    return bounds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 2, 3, 3, 4]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lower_bounds(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 2, 3, 3, 3, 4, 4, 4, 5]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lower_bounds(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Yeah, that looks right.\n",
    "\n",
    "**Bo: And now for the upper bounds.  For, say, *N*=10, the biggest sum in the distribution of sums is 20. One of the lower bounds for *N*=10 is 5, meaning that every 10-sided die will have a 5 or higher. That implies that 15 is an upper bound for every side: no 10-sided die can have a number more than 15, because otherwise there would be a sum over 20.**\n",
    "\n",
    "Ali: So 15 is the upper bound for the final side (and 1 is the upper bound for the first side). And as I just said, we know that whatever the biggest number is, there can only be one of them, so the upper bound for the penultimate number is one less than the final number. What else do we know?\n",
    "\n",
    "**Bo: It gets complicated. It was easier with the lower bounds, because we know every die has a 1. We don't know exactly what the biggest number will be on a die. So I'm happy letting all the upper bounds between the first and last sides be one less than the last:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "def upper_bounds(N):\n",
    "    \"A list of upper bounds for respective sides of an N-sided die.\"\n",
    "    biggest = 2 * N - lower_bounds(N)[-1]\n",
    "    middle = [biggest - 1] * (N - 2)\n",
    "    return [1, *middle, biggest]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: I'll make a little thing to display the bounds:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "def show_bounds(Ns): \n",
    "    for N in Ns: \n",
    "        print(f'N = {N:2}: ' + \n",
    "              ' | '.join(f'{L}-{U:<2}' for L, U in zip(lower_bounds(N), upper_bounds(N))))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N =  6: 1-1  | 2-7  | 2-7  | 3-7  | 3-7  | 4-8 \n",
      "N =  8: 1-1  | 2-10 | 2-10 | 3-10 | 3-10 | 3-10 | 4-10 | 5-11\n",
      "N = 10: 1-1  | 2-14 | 2-14 | 3-14 | 3-14 | 3-14 | 4-14 | 4-14 | 4-14 | 5-15\n",
      "N = 12: 1-1  | 2-17 | 2-17 | 3-17 | 3-17 | 3-17 | 4-17 | 4-17 | 4-17 | 4-17 | 5-17 | 6-18\n",
      "N = 13: 1-1  | 2-19 | 2-19 | 3-19 | 3-19 | 3-19 | 4-19 | 4-19 | 4-19 | 4-19 | 5-19 | 5-19 | 6-20\n"
     ]
    }
   ],
   "source": [
    "show_bounds((6, 8, 10, 12, 13))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: I can use the bounds to make a more constrained version of `all_dice(N)`, like this:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "def all_dice(N=6) -> List[Die]:\n",
    "    \"\"\"All feasible Sicherman dice with N sides.\"\"\"\n",
    "    uppers = upper_bounds(N)\n",
    "    dice = [ [1] ] # The first side must be 1\n",
    "    for i in range(1, N): # Add N - 1 more sides\n",
    "        dice = [die + [side] \n",
    "                for die in dice \n",
    "                for side in ints(die[-1] + (i == (N - 1)), uppers[i])\n",
    "                if die.count(side) < side]\n",
    "    return dice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: What's up with `+ (i == (N - 1))`?\n",
    "\n",
    "**Bo: Oh. In Python a Boolean expression is equivalent to 0 or 1, so this says \"the lower bound for the next side is the same as the lower bound for the previous side, `die[-1]`, except that if this is the last side, then the lower bound is one more.\"**\n",
    "\n",
    "Ali: How come you don't even reference `lower_bounds(N)` within `all_dice(N)`? And how come you have the `if die.count(side) < side]` clause? Is this code repetition intentional, or did you forget that `lower_bound(N)` already dealt with having only two 2s, three 3s, etc.?\n",
    "\n",
    "**Bo: It is intentional, but I agree with you that it is confusing and look repetitious. Even though `lower_bounds` is not used in `all_dice`, it is used in `upper_bounds`; without it, the bounds wouldn't be as tight. The difference is that in `lower_bounds(N)` we're answering the question \"what's the lowest possible value for side *s* across every single *N*-sided die?\" In `all_dice(N)`, we're considering a different question: \"for this particular dice, what numbers can come next?\" That's two different things, so we need two pieces of code.**\n",
    "\n",
    "**Suppose that with `N = 6` we have the partial die `[1, 3, 3, 3]`. The lower bound for the next side is 3, and there are many die that have a 3 on the 5th side. But for this particular die we can't allow a fourth 3; that's why `all_dice(N)` needs to check, and can't rely on `lower_bounds(N)` alone.**\n",
    "\n",
    "Ali: I got it. \n",
    "\n",
    "Let's see how much we have reduced the numbers of `all_dice`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{2: 1, 3: 1, 4: 10, 5: 31, 6: 226, 7: 1555, 8: 5728, 9: 39416, 10: 266931}"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "{N: len(all_dice(N)) for N in ints(2, 10)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Wow! For `N = 6` we're down from 1287 to 226 dice, and for 10 down from 2 million to 266,931.\n",
    "\n",
    "**Bo: Let's check out some of the `all_dice(6)`; hopefully we got rid of the die with five 2s:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2, 2, 3, 3, 4],\n",
       " [1, 2, 2, 4, 5, 7],\n",
       " [1, 2, 3, 3, 4, 5],\n",
       " [1, 2, 3, 5, 5, 6],\n",
       " [1, 2, 4, 5, 5, 6],\n",
       " [1, 2, 6, 6, 6, 7],\n",
       " [1, 3, 3, 4, 5, 7],\n",
       " [1, 3, 4, 4, 5, 7],\n",
       " [1, 3, 5, 5, 7, 8],\n",
       " [1, 4, 4, 5, 5, 6],\n",
       " [1, 4, 6, 6, 6, 7],\n",
       " [1, 6, 6, 6, 6, 7]]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(all_dice(6))[::20]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Let's rerun `sicherman(N)` on small values of `N`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 121 ms, sys: 2.23 ms, total: 124 ms\n",
      "Wall time: 123 ms\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{2: [],\n",
       " 3: [],\n",
       " 4: [([1, 2, 2, 3], [1, 3, 3, 5])],\n",
       " 5: [],\n",
       " 6: [([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])]}"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time {N: sicherman(N) for N in ints(2, 6)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: Awesome! We get the same results, but faster. Let's proceed cautiously on to 7:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 6.85 s, sys: 96.9 ms, total: 6.95 s\n",
      "Wall time: 6.98 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time sicherman(7)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: Too bad! Too bad we didn't get a solution for *N* = 7, and too bad that it took 60 times longer than *N* = 6. We could do *N* = 8 with a wait of a few minutes, but *N* = 9 will take hours.\n",
    "\n",
    "**Bo: We need a brand new approach; there are just too many dice. I'm not sure what to do to reduce them.**\n",
    "\n",
    "Ali: Maybe we could concentrate on considering fewer *pairs* of dice, without worrying about considering fewer dice?\n",
    "\n",
    "**Bo: How could we do that? Isn't the number of pairs always determined by the number of dice?**\n",
    "\n",
    "# Reducing the Number of Pairs\n",
    "\n",
    "Ali: Remember, we're looking for *feasible* pairs. So if there was some way of knowing ahead of time that two dice were incompatible as a pair, we wouldn't even need to consider the pair; we'd spend zero run time on them.\n",
    "\n",
    "**Bo: By incompatible, you mean they can't form a pair that is a solution.**\n",
    "\n",
    "Ali: Right. Consider this: in any valid pair, the sum of the biggest number on each die must be 2*N*.  For example, with *N* = 6:\n",
    "\n",
    "    ([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])    sum of biggests = 4 + 8 = 12\n",
    "    ([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])    sum of biggests = 6 + 6 = 12\n",
    "    \n",
    "So if we have a 6-sided die with biggest number 7, what dice should we consider pairing it with?\n",
    "\n",
    "**Bo: Only ones with biggest number 5.**\n",
    "\n",
    "**I get it: we sort all the die into bins labeled with their biggest number.  Then we look at each bin, and for the \"7\" bin, we pair them up with the dice in the \"5\" bin. In general, the *B* bin can only pair with the 2*N* - *B* bin.**\n",
    "\n",
    "Ali: Exactly. \n",
    "\n",
    "**Bo: Cool. I can see how that can cut the amount of work by a factor of 10 or so. But  I was hoping for a factor of a million.**\n",
    "\n",
    "Ali: There are other properties of a feasible pair.\n",
    "\n",
    "**Bo: Like what?**\n",
    "\n",
    "Ali: Well, what about the number of 2s in a pair?\n",
    "\n",
    "**Bo: Let's see.  We know that any distribution of sums has to have two 3s, and the only way to make a 3 is 2+1.  And each die has only one 1, so that means that each pair of dice has to have a total of exactly two 2s.**\n",
    "\n",
    "Ali: Does it have to be one 2 on each die?\n",
    "\n",
    "**Bo: No.  It could be one each, or it could be two on one die and none on the other. So a die with *t* twos can only pair with dice that have 2 - *t* twos.**\n",
    "\n",
    "Ali: How about the number of 3s in a pair?\n",
    "\n",
    "**Bo: Hmm. A sum has to have three 4s, and the 4s have to come from either 2 + 2 or 3 + 1.  So it is not as straightforward as with 2s. Let's make a table of allowable combinations for two dice, A and B:**\n",
    "\n",
    "|A<br>1s|A<br>2s|A<br>3s|B<br>1s|B<br>2s|B<br>3s|ways to <br>sum to 4|\n",
    "|-------|-------|-------|-------|-------|-------|-----|\n",
    "|1|0 or 2|0|1|2 or 0|3|1+3, 1+3, 1+3|\n",
    "|1|0 or 2|1|1|2 or 0|2|3+1, 1+3, 1+3|\n",
    "|1|0 or 2|2|1|2 or 0|1|3+1, 3+1, 1+3|\n",
    "|1|0 or 2|3|1|2 or 0|0|3+1, 3+1, 3+1|\n",
    "|1|1|0|1|1|2|2+2, 1+3, 1+3|\n",
    "|1|1|1|1|1|1|2+2, 3+1, 1+3|\n",
    "|1|1|2|1|1|0|2+2, 3+1, 3+1|\n",
    "\n",
    "**Bo: It looks like the number of 3s in a pair totals three if one die has both 2s, or to two if the 2s are split.**\n",
    "\n",
    "Ali: Great.  I've got one more property we could use. Consider the sums of 6-sided Sicherman and standard pairs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "42"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum([1, 2, 2, 3, 3, 4] + [1, 3, 4, 5, 6, 8])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "42"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum([1, 2, 3, 4, 5, 6] + [1, 2, 3, 4, 5, 6])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: They're the same. Is that [the question](http://hitchhikers.wikia.com/wiki/42) that 42 is the answer to? But does a Sicherman pair always have to have the same sum as a standard pair? I guess it does, because the sum of `distribution_of_sums(pair)` is just all the sides added up *N* times each, so two pairs have the same sum of `distribution_of_sums(pair)` if and only if they have the same sum.**\n",
    "\n",
    "Ali: So consider the die [1, 3, 3, 3, 4, 5]. What do we know about the dice that it can possibly pair with?\n",
    "\n",
    "**Bo: OK, that die has a biggest side of 5, so it can only pair with dice that have a biggest side of 12 - 5 = 7. It has a sum of 19, so it can only pair with dice that have a sum of 42 - 19 = 23. It has no 2s, so it can only pair with dice that have two 2s. And it has three 3s (and no 2s), so it can only pair with dice with no 3s.**\n",
    "\n",
    "Ali: I wonder how many such dice there are, out of all 226 `all_dice(6)`?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2, 2, 5, 6, 7]]"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[die for die in all_dice(6) \n",
    " if max(die) == 7 and sum(die) == 23 and die.count(2) == 2 and die.count(3) == 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: There's only one! So, (1, 3, 3, 3, 4, 5) only has to try to pair with one die, rather than 226. Nice improvement!**\n",
    "\n",
    "Ali: In general, what's the sum of the sides of a standard pair?\n",
    "\n",
    "**Bo: Easy, that's *N* × (*N* + 1). [Gauss](http://betterexplained.com/articles/techniques-for-adding-the-numbers-1-to-100/) knew that when he was in elementary school!**\n",
    "\n",
    "# Better Efficiency\n",
    "\n",
    "**Bo: OK, we can code this up easily enough, redefining `all_pairs(dice)` to be more efficient:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [],
   "source": [
    "from collections import defaultdict, namedtuple\n",
    "\n",
    "Label = namedtuple('Label', 'sum, max, twos, threes')\n",
    "\n",
    "def label(die) -> Label: \n",
    "    \"\"\"Return a tuple of the sum of the die, the biggest side, and the number of 2's and 3's.\"\"\"\n",
    "    return Label(sum(die), max(die), die.count(2), die.count(3))\n",
    "\n",
    "def all_pairs(dice: List[Die]) -> Iterable[Pair]:\n",
    "    \"\"\"Yield all pairs of dice that could possibly be a solution to the Sicherman problem.\"\"\"\n",
    "    table = tabulate_dice(dice)\n",
    "    N = len(dice[0])\n",
    "    for label1 in table:\n",
    "        label2 = compatible_label(label1, N)\n",
    "        if label1 <= label2 and label2 in table:\n",
    "            for A in table[label1]:\n",
    "                for B in table[label2]:\n",
    "                    yield (A, B)\n",
    "\n",
    "def tabulate_dice(dice: List[Die]) -> Dict[Label, List[Die]]:\n",
    "    \"\"\"Put all dice into bins in a hash table, keyed by label(die).\n",
    "    Each bin holds a list of dice with that key.\"\"\"\n",
    "    # Example: {(21, 6, 1): [(1, 2, 3, 4, 5, 6), (1, 2, 3, 4, 4, 7), ...]\n",
    "    table = defaultdict(list)\n",
    "    for die in dice:\n",
    "        table[label(die)].append(die)\n",
    "    return table\n",
    "\n",
    "def compatible_label(label: Label, N) -> Label: \n",
    "    \"\"\"Return a bin Label that is compatible with `label` for N-sided dice.\"\"\"\n",
    "    threes = (2 if label.twos == 1 else 3) - label.threes\n",
    "    return Label(N * (N + 1) - label.sum, 2 * N - label.max, 2 - label.twos, threes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: let's see what a table looks like. Here's with *N* = 5:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{Label(sum=12, max=4, twos=2, threes=1): [[1, 2, 2, 3, 4]],\n",
       " Label(sum=13, max=5, twos=2, threes=1): [[1, 2, 2, 3, 5]],\n",
       " Label(sum=14, max=6, twos=2, threes=1): [[1, 2, 2, 3, 6]],\n",
       " Label(sum=14, max=5, twos=2, threes=0): [[1, 2, 2, 4, 5]],\n",
       " Label(sum=15, max=6, twos=2, threes=0): [[1, 2, 2, 4, 6]],\n",
       " Label(sum=16, max=6, twos=2, threes=0): [[1, 2, 2, 5, 6]],\n",
       " Label(sum=13, max=4, twos=1, threes=2): [[1, 2, 3, 3, 4]],\n",
       " Label(sum=14, max=5, twos=1, threes=2): [[1, 2, 3, 3, 5]],\n",
       " Label(sum=15, max=6, twos=1, threes=2): [[1, 2, 3, 3, 6]],\n",
       " Label(sum=15, max=5, twos=1, threes=1): [[1, 2, 3, 4, 5]],\n",
       " Label(sum=16, max=6, twos=1, threes=1): [[1, 2, 3, 4, 6]],\n",
       " Label(sum=17, max=6, twos=1, threes=1): [[1, 2, 3, 5, 6]],\n",
       " Label(sum=16, max=5, twos=1, threes=0): [[1, 2, 4, 4, 5]],\n",
       " Label(sum=17, max=6, twos=1, threes=0): [[1, 2, 4, 4, 6]],\n",
       " Label(sum=18, max=6, twos=1, threes=0): [[1, 2, 4, 5, 6]],\n",
       " Label(sum=19, max=6, twos=1, threes=0): [[1, 2, 5, 5, 6]],\n",
       " Label(sum=14, max=4, twos=0, threes=3): [[1, 3, 3, 3, 4]],\n",
       " Label(sum=15, max=5, twos=0, threes=3): [[1, 3, 3, 3, 5]],\n",
       " Label(sum=16, max=6, twos=0, threes=3): [[1, 3, 3, 3, 6]],\n",
       " Label(sum=16, max=5, twos=0, threes=2): [[1, 3, 3, 4, 5]],\n",
       " Label(sum=17, max=6, twos=0, threes=2): [[1, 3, 3, 4, 6]],\n",
       " Label(sum=18, max=6, twos=0, threes=2): [[1, 3, 3, 5, 6]],\n",
       " Label(sum=17, max=5, twos=0, threes=1): [[1, 3, 4, 4, 5]],\n",
       " Label(sum=18, max=6, twos=0, threes=1): [[1, 3, 4, 4, 6]],\n",
       " Label(sum=19, max=6, twos=0, threes=1): [[1, 3, 4, 5, 6]],\n",
       " Label(sum=20, max=6, twos=0, threes=1): [[1, 3, 5, 5, 6]],\n",
       " Label(sum=18, max=5, twos=0, threes=0): [[1, 4, 4, 4, 5]],\n",
       " Label(sum=19, max=6, twos=0, threes=0): [[1, 4, 4, 4, 6]],\n",
       " Label(sum=20, max=6, twos=0, threes=0): [[1, 4, 4, 5, 6]],\n",
       " Label(sum=21, max=6, twos=0, threes=0): [[1, 4, 5, 5, 6]],\n",
       " Label(sum=22, max=6, twos=0, threes=0): [[1, 5, 5, 5, 6]]}"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dict(tabulate_dice(all_dice(5)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: That's good! Every die has its own singleton bin, so each die will only have to be paired with one other, at most.\n",
    "\n",
    "**Bo: Let's look at some statistics for *N* = 8:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5728, 938, 1559)"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "N = 8\n",
    "table = tabulate_dice(all_dice(N))\n",
    "len(all_dice(N)), len(table), len(list(all_pairs(all_dice(N))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: This says there are 5,728 8-sided dice; they're sorted into 938 labeled bins, and `all_pairs` only needs to look at 1,559 pairs, not the 32 million pairs that the old version of `all_pairs` would need to consider.**\n",
    "\n",
    "Ali: That's fantastic. Why are there so few pairs to consider? Less than one pair per die?\n",
    "\n",
    "**Bo: I'm not sure. I can get a Counter of how many times we see each size bin, as we go through `table`:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 603),\n",
       " (1, 123),\n",
       " (2, 54),\n",
       " (3, 36),\n",
       " (4, 28),\n",
       " (5, 23),\n",
       " (11, 15),\n",
       " (7, 12),\n",
       " (6, 12),\n",
       " (16, 5),\n",
       " (8, 5),\n",
       " (12, 5),\n",
       " (9, 4),\n",
       " (18, 3),\n",
       " (20, 3),\n",
       " (30, 2),\n",
       " (29, 2),\n",
       " (19, 2),\n",
       " (42, 1)]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Counter(len(table.get(compatible_label(label1, N), [])) for label1 in table).most_common()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "**Bo: OK, that was kind of dense, but what it says is:**\n",
    "- **Of the 938 bins in the *N* = 8 table, 603 of them have a compatible bin with zero dice in it.**\n",
    "- **That means that 2/3 the time, we don't have to do *any* checking of pairs.**\n",
    "- **That's the main reason why the number of pairs is less than the number of dice.**\n",
    "- **123 bins have a compatible bin with just 1 die, 54 with just 2 dice, and so on.**\n",
    "- **So for over 80% of bins, there will be zero checking of pairs, or just one or two.**\n",
    "\n",
    "Ali: Amazing!\n",
    "\n",
    "# Final Results\n",
    "\n",
    "**Bo: I think this will be a lot faster. Let's try to go up to N=9:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "button": false,
    "collapsed": false,
    "deletable": true,
    "jupyter": {
     "outputs_hidden": false
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 256 ms, sys: 4.42 ms, total: 260 ms\n",
      "Wall time: 261 ms\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{2: [],\n",
       " 3: [],\n",
       " 4: [([1, 2, 2, 3], [1, 3, 3, 5])],\n",
       " 5: [],\n",
       " 6: [([1, 2, 2, 3, 3, 4], [1, 3, 4, 5, 6, 8])],\n",
       " 7: [],\n",
       " 8: [([1, 2, 2, 3, 3, 4, 4, 5], [1, 3, 5, 5, 7, 7, 9, 11]),\n",
       "  ([1, 2, 2, 3, 5, 6, 6, 7], [1, 3, 3, 5, 5, 7, 7, 9]),\n",
       "  ([1, 2, 3, 3, 4, 4, 5, 6], [1, 2, 5, 5, 6, 6, 9, 10])],\n",
       " 9: [([1, 2, 2, 3, 3, 3, 4, 4, 5], [1, 4, 4, 7, 7, 7, 10, 10, 13])]}"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time {N: sicherman(N) for N in ints(2, 9)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: Excellent! Very fast; we got the same answers as before for 2–7, and we got **three** solutions for 8 and one for 9, all in  well under a second! \n",
    "\n",
    "I'm confident we can [go to 11](https://www.youtube.com/watch?v=hW008FcKr3Q):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 2.19 s, sys: 23.9 ms, total: 2.21 s\n",
      "Wall time: 2.22 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[([1, 2, 2, 3, 3, 4, 4, 5, 5, 6], [1, 3, 5, 6, 7, 8, 9, 10, 12, 14])]"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time sicherman(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 26.5 s, sys: 432 ms, total: 26.9 s\n",
      "Wall time: 28.1 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time sicherman(11)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: Why not 12? It might take 5 minutes or so, so we can go get a coffee and come back to see the result.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 5min 34s, sys: 3.1 s, total: 5min 37s\n",
      "Wall time: 5min 48s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[([1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6],\n",
       "  [1, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 18]),\n",
       " ([1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7],\n",
       "  [1, 3, 5, 7, 7, 9, 9, 11, 11, 13, 15, 17]),\n",
       " ([1, 2, 2, 3, 3, 4, 7, 8, 8, 9, 9, 10],\n",
       "  [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14]),\n",
       " ([1, 2, 2, 3, 5, 6, 6, 7, 9, 10, 10, 11],\n",
       "  [1, 3, 3, 5, 5, 7, 7, 9, 9, 11, 11, 13]),\n",
       " ([1, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8],\n",
       "  [1, 2, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16]),\n",
       " ([1, 2, 3, 3, 4, 5, 7, 8, 9, 9, 10, 11],\n",
       "  [1, 2, 4, 5, 5, 6, 8, 9, 9, 10, 12, 13]),\n",
       " ([1, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9],\n",
       "  [1, 2, 3, 7, 7, 8, 8, 9, 9, 13, 14, 15])]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%time sicherman(12)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "deletable": true,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    }
   },
   "source": [
    "Ali: Wow! Seven distinct solutions for *N* = 12. Why seven?\n",
    "\n",
    "**Bo: I don't know. I can make a table summarizing  the results, but I can't explain or predict them.**\n",
    "\n",
    "\n",
    "|N|Count|First of Dice Pair |Second of Dice Pair|\n",
    "|--|--|:--|:--|\n",
    "|2|0| | |\n",
    "|3|0| | |\n",
    "|4 | 1 | [1, 2, 2, 3]                 |            [1, 3, 3, 5]|\n",
    "|5|0| | |\n",
    "|6 | 1 | [1, 2, 2, 3, 3, 4]            |           [1, 3, 4, 5, 6, 8]\n",
    "|7 | 0 |\n",
    "|8 | 3 | [1, 2, 2, 3, 3, 4, 4, 5]         |        [1, 3, 5, 5, 7, 7, 9, 11]|\n",
    " |||          [1, 2, 3, 3, 4, 4, 5, 6]         |        [1, 2, 5, 5, 6, 6, 9, 10]|\n",
    " |||          [1, 2, 2, 3, 5, 6, 6, 7]        |         [1, 3, 3, 5, 5, 7, 7, 9]|\n",
    "|     9  |1 | [1, 2, 2, 3, 3, 3, 4, 4, 5]        |      [1, 4, 4, 7, 7, 7, 10, 10, 13]|\n",
    "|    10 | 1 | [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]      |     [1, 3, 5, 6, 7, 8, 9, 10, 12, 14]|\n",
    "|    11  |0|||\n",
    "|    12 | 7 | [1, 2, 2, 3, 3, 3, 4, 4, 4,  5,  5,  6] | [1, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 18]|\n",
    "|||         [1, 2, 2, 3, 3, 4, 4, 5, 5,  6,  6,  7]  |[1, 3, 5, 7, 7, 9,  9, 11, 11, 13, 15, 17]|\n",
    "|||        [1, 2, 3, 3, 4, 4, 5, 5, 6,  6,  7,  8] | [1, 2, 5, 6, 7, 8,  9, 10, 11, 12, 15, 16]|\n",
    "|||         [1, 2, 3, 4, 4, 5, 5, 6, 6,  7,  8,  9] | [1, 2, 3, 7, 7, 8,  8,  9,  9, 13, 14, 15]|\n",
    "|||         [1, 2, 2, 3, 3, 4, 7, 8, 8,  9,  9, 10]  |[1, 3, 4, 5, 6, 7,  8,  9, 10, 11, 12, 14]|\n",
    "|||        [1, 2, 2, 3, 5, 6, 6, 7, 9, 10, 10, 11] | [1, 3, 3, 5, 5, 7,  7,  9,  9, 11, 11, 13]|\n",
    "|||         [1, 2, 3, 3, 4, 5, 7, 8, 9,  9, 10, 11] | [1, 2, 4, 5, 5, 6,  8,  9,  9, 10, 12, 13]|\n",
    "\n",
    "**Bo: I think we can stop here.**\n",
    "\n",
    "\n",
    "\n",
    "Ali: I agree. But I still have some questions:\n",
    "\n",
    "- Do all composite *N* have solutions?\n",
    "- Does any prime *N* have a solution?\n",
    "- Do some *N* have more than the 3 solutions that 8 has?\n",
    "- Does the fact that 8 has 3 solutions have anything to do with the fact that 8 is a perfect cube?\n",
    "- What about the 7 solutions for 12? 12 has 6 factors; does that matter?\n",
    "- The 9 perfect squares, 4 and 9, have 1 solution each; is that true of all perfect squares?\n",
    "- Could we make this 10 times faster? 100 times?\n",
    "- Could we handle greater values of *N* if we worked with [generating functions](https://en.wikipedia.org/wiki/Sicherman_dice)?\n",
    "- What about Sicherman triples (3 nonstandard dice with the same distribution of sums as 3 standard dice)?\n",
    "\n",
    "**Bo: Good questions, but I think I'd rather leave those for the reader to deal with. (Hi, reader!)**\n",
    "\n",
    "Ali: Hi reader! Let us know what you find out!\n",
    "\n",
    "<hr>\n",
    "\n",
    "*That was the end of the story. But a year later this came up:*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# One Year Later: Sicherman Cards\n",
    "\n",
    "Ali: Hey Bo, lookat this: [Alexandre Muñiz](https://puzzlezapper.com/blog/) just posted something cool: you can get the same distribution as a pair of 6-sided dice by using a deck of 9 cards, shuffling, drawing two cards (without replacement), and taking the average of the two cards (not the sum).\n",
    "\n",
    "**Bo: What are the nine cards?**\n",
    "\n",
    "Ali: I didn't write them down. But we should be able to reconstruct the answer. First, here's a function to generate all possible decks of cards that give the same distribution as a pair of *n*-sided dice, and one to compute the distribution from drawing two cards from a deck of cards. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "from statistics import mean\n",
    "from itertools import combinations\n",
    "\n",
    "def sicherman_cards(N=6) -> List[List[int]]:\n",
    "    \"\"\"All decks of cards that generate the same distribution as two n-sided dice.\"\"\"\n",
    "    dist = distribution_of_sums(nth_standard_pair(N))\n",
    "    return [deck for deck in all_decks(N) \n",
    "            if distribution_of_cards(deck) == dist]\n",
    "    \n",
    "def distribution_of_cards(deck, draw=2) -> Counter: \n",
    "    \"\"\"The distribution of possible means from drawing cards from the deck.\"\"\"\n",
    "    return Counter(map(mean, combinations(deck, draw)))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: Good. Now all that is left is `all_decks`. Should we solve it for decks of any size, or should we first just solve it for decks of 9 cards? And how do we know that 9 cards is the right number? How many cards is it for *n*-sided dice in general?**\n",
    "\n",
    "Ali: Well, for two distributions to be equal, they must have the same number of entries. For two *n*-sided dice there are *n*<sup>2</sup> entries in the distribution. For *c* cards there are (*c* choose 2) = *c* × (*c* - 1) / 2 combinations of two cards. So the only numbers of sides on a dice *n* for which we could possibly duplicate the distribution with a deck of *c* cards are  cases where *n*<sup>2</sup> = *c* × (*c* - 1) / 2. Alexandre pointed out that this means we have to find numbers that are [both square numbers and triangular numbers](https://en.wikipedia.org/wiki/Square_triangular_number). The first two non-trivial such numbers are 36 and 1225, corresponding to 6-sided dice matching a 9-card deck,  and 35-sided dice matching a 50-card deck.\n",
    "\n",
    "**Bo: Let me see if I get this. You're saying that no deck of cards could possibly work for, say, 10-sided dice?**\n",
    "\n",
    "Ali: Right, because two 10-sided dice produce a distribution with 100 entries, and there is no number of cards in a deck that would match that. With 14 cards we get 91 entries, and with 15 cards, 105.\n",
    "\n",
    "**Bo: OK. We'll write `all_decks(N)` to handle any `n`, but we won't try to make it efficient enough to handle *N* = 35. Now, what do we know about the possible cards in a deck? One thing I noticed is that either all the cards must be even, or all the cards must be odd. Because the mean of an even and an odd number is not an integer.**\n",
    "\n",
    "Ali: Right. Now we know we want to match a distribution that has one 2 in it. That means there should be one combination of two cards in the deck with mean 2, meaning sum 4.\n",
    "\n",
    "**Bo: So either the deck starts with [1, 3] or it starts with [2, 2].**\n",
    "\n",
    "Ali: Nice. Now let's figure out the biggest possible number in the deck. We want to match the distribution that has one instance of 2*n* in it. So we want exactly one combination of two cards that adds up to 4*n*, which is 24 when *n* = 6. For example, it could be [11, 13] or [12, 12]. But it could possibly be higher numbers, like [9, 15] or [10 14]. I'm not sure what the upper bound is.\n",
    "\n",
    "**Bo: To be safe, let's go up to  3*n*.**\n",
    "\n",
    "Ali: I think we're ready to write `all_decks(N)`. It will have mostly the same stucture as `all_dice(N)`, keeping track of all possible partial decks, and adding a card to each one in each possible way on each round. It will have to compute the number of cards, `Ncards` in the deck."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "def all_decks(N) -> List[Die]:\n",
    "    \"\"\"All decks of cards to possibly match the distribution of a pair of n-sided dice.\"\"\"\n",
    "    Ncards = next((c for c in range(3 * N) if N ** 2 == c * (c - 1) // 2), None)\n",
    "    if Ncards is None:\n",
    "        return []\n",
    "    decks = [ [1, 3], [2, 2] ] # The first two cards must be 1,3 or 2,2. \n",
    "    for _ in range(Ncards - 2): # Add cards, each nondecreasing, with max 2N + 2\n",
    "        decks = [deck + [card] \n",
    "                 for deck in decks \n",
    "                 for card in range(deck[-1], 3 * N, 2)]\n",
    "    return decks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ali: Here goes ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 3, 5, 5, 7, 9, 9, 11, 13]]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sicherman_cards(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bo: There it is. There's exactly one card deck that matches standard dice.**\n",
    "\n",
    "Ali: Actually, though, one thing bothers me. We need to solve for *n*<sup>2</sup> = *c* × (*c* - 1) / 2. But if we're trying to find a deck that matches the distribution for two *n*-sided dice, we want the same *probability* distribution, not the exact distribution counts. So for 10-sided dice, wouldn't it be okay to find a deck of cards that has a *multiple* of 10<sup>2</sup> combinations? For example, with *c* = 25 cards, we get 300 combinations. Couldn't that work?\n",
    "\n",
    "**Bo: Good idea, but I don't know if it could work. Let's try to figure it out. With 300 combinations we would need 3 ways to make a mean of 2, in order to have the right probability. We also have to make sure the deck doesn't produce a combination with mean less than 2. So the decks can't have multiple 1s. But it could start with [1, 3, 3, 3]. Or it could start with  [2, 2, 2]. But I can't tell if we could proceed from there. And we'd have to do a lot of work on efficiency to handle decks with 25 cards.**\n",
    "\n",
    "Ali: I'm also curious what the answer is for *n* = 35. We don't know if there is one solution, none, or many.  I guess we'll have to leave these questions for another day."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}