Add files via upload

2018-10-27 20:22:24 -07:00 · 2018-10-27 20:22:24 -07:00 · 10b43f62ef
commit 10b43f62ef
parent ebda2935b1
1 changed files with 484 additions and 0 deletions
--- a/ipynb/Orderable
+++ b/ipynb/Orderable
@@ -0,0 +1,484 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Properly Ordered Card Hands\n",
+    "\n",
+    "The 538 Riddler [presented](https://fivethirtyeight.com/features/who-will-capture-the-most-james-bonds/) this problem by Matt Ginsberg:\n",
+    "    \n",
+    "> *You play so many card games that you’ve developed a very specific organizational obsession. When you’re dealt your hand, you want to organize it such that the cards of a given suit are grouped together and, if possible, such that no suited groups of the same color are adjacent. (Numbers don’t matter to you.) Moreover, when you receive your randomly ordered hand, you want to achieve this organization with a single motion, moving only one adjacent block of cards to some other position in your hand, maintaining the original order of that block and other cards, except for that one move.*\n",
+    "\n",
+    "> *Suppose you’re playing pitch, in which a hand has six cards. What are the odds that you can accomplish your obsessive goal? What about for another game, where a hand has N cards, somewhere between 1 and 13?*\n",
+    "\n",
+    "# Complexity\n",
+    "\n",
+    "The first thing to determine is how many `N`-card hands there are; that will tell me whether I can just use brute force.\n",
+    "\n",
+    "The answer is (52 choose `N`), and (52 choose 6) is 20,358,520. So it is barely feasible to use brute force there. But I notice that the problem states *\"Numbers don’t matter,\"* so I can just consider the *suits* of the cards. Then there are only 4<sup>`N`</sup> hands, which is a mere 4,096 for `N` = 6, and a barely feasible 67,108,864 for `N` = 13.\n",
+    "\n",
+    "# Deals: Hands and their Probabilities\n",
+    "\n",
+    "There are two red suits and two black suits, so I'll represent the four suits with the characters `'rbRB'`. (I also considered using `'♠️♥️♦️♣️'`.) I'll represent a hand as a string of suits: `'rrBrbr'` is a 6-card hand. I'll define `deals(N)` to return a dict of all possible hands of length `N`, each mapped to the probability of the hand. I'll use exact `Fraction` arithmetic. I'll use `lru_cache` when there are expensive computations that I don't want to repeat."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "import re\n",
+    "from collections import defaultdict, Counter\n",
+    "from fractions import Fraction\n",
+    "from functools import lru_cache\n",
+    "\n",
+    "one   = Fraction(1)\n",
+    "suits = 'rbRB'\n",
+    "\n",
+    "@lru_cache()\n",
+    "def deals(N):\n",
+    "    \"A dict of {'chars': probability} for all hands of length N.\"\n",
+    "    if N == 0:\n",
+    "        return {'': one}\n",
+    "    else:\n",
+    "        return {hand + suit: p * (13 - hand.count(suit)) / (52 - len(hand))\n",
+    "                for (hand, p) in deals(N - 1).items()\n",
+    "                for suit in suits}"
+   ]
+  },
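+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick sanity check (a sketch, not part of the solution itself): the counts quoted above should match, and the probabilities of all 6-card hands should sum to exactly 1. The helper `choose` is a name introduced only for this check."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "from math import factorial\n",
+    "\n",
+    "def choose(n, k):\n",
+    "    \"Number of ways to choose k of n items (helper introduced only for this check).\"\n",
+    "    return factorial(n) // (factorial(k) * factorial(n - k))\n",
+    "\n",
+    "assert choose(52, 6) == 20358520          # all 6-card hands, counting ranks\n",
+    "assert len(deals(6)) == 4 ** 6 == 4096    # suit-only hands\n",
+    "assert sum(deals(6).values()) == 1        # exact, thanks to Fraction"
+   ]
+  },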
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'BB': Fraction(1, 17),\n",
+       " 'BR': Fraction(13, 204),\n",
+       " 'Bb': Fraction(13, 204),\n",
+       " 'Br': Fraction(13, 204),\n",
+       " 'RB': Fraction(13, 204),\n",
+       " 'RR': Fraction(1, 17),\n",
+       " 'Rb': Fraction(13, 204),\n",
+       " 'Rr': Fraction(13, 204),\n",
+       " 'bB': Fraction(13, 204),\n",
+       " 'bR': Fraction(13, 204),\n",
+       " 'bb': Fraction(1, 17),\n",
+       " 'br': Fraction(13, 204),\n",
+       " 'rB': Fraction(13, 204),\n",
+       " 'rR': Fraction(13, 204),\n",
+       " 'rb': Fraction(13, 204),\n",
+       " 'rr': Fraction(1, 17)}"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "deals(2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Is that right? Yes it is. The probability of `'BB'` is 1/17, because the probability of the first `'B'` is 13/52 or 1/4, and when we deal the second card, one `'B'` is gone, so its probability is 12/51; the product simplifies to 1/4 &times; 12/51 = 3/51 = 1/17. The probability of `'BR'` is 13/204, because the probability of the `'R'` is 13/51, and 1/4 &times; 13/51 = 13/204.\n",
+    "\n",
+    "# Collapsing Hands\n",
+    "\n",
+    "I'll introduce the idea of *collapsing* a hand by replacing a run of cards of the same suit with a single card, so that: \n",
+    "\n",
+    "     collapse('BBBBBrrrrBBBB') == 'BrB'\n",
+    "     \n",
+    "I'll use the term *hand* for `'BBBBBrrrrBBBB'`, and *sequence* or *seq* for the collapsed version, `'BrB'`.\n",
+    "\n",
+    "# Properly Ordered Hands\n",
+    "\n",
+    "A hand is considered properly `ordered` if *\"the cards of a given suit are grouped together and, if possible, such that no suited groups of the same color are adjacent.\"* I was initially confused about the meaning of *\"if possible\";* Matt Ginsberg confirmed it means that the hand `'BBBbbb'` is properly ordered, because it is not possible to separate the two black suits, while `'BBBbbR'` is not properly ordered, because the red card could have been inserted between the two black runs.\n",
+    "\n",
+    "So a hand is properly ordered if, considering its collapsed sequence, each suit appears only once, and either all the colors are the same, or suits of the same color don't appear adjacent to each other."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "def ordered(hand):\n",
+    "    \"Properly ordered if each suit run appears only once, and same color suits not adjacent.\"\n",
+    "    seq = collapse(hand)\n",
+    "    return once_each(seq) and (len(colors(seq)) == 1 or not adjacent_colors(seq))\n",
+    "                                 \n",
+    "def collapse(hand): return re.sub(r'(.)\\1+', r'\\1', hand)\n",
+    "def once_each(seq): return max(Counter(seq).values()) == 1\n",
+    "def colors(seq):    return set(seq.casefold())\n",
+    "adjacent_colors =   re.compile('rR|Rr|Bb|bB').search"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'BrB'"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "collapse('BBBBBrrrrBBBB')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "ordered('BBBbbR') "
+   ]
+  },
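+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To see the *\"if possible\"* rule in action, here is a small illustrative check. The example hands are picked only for this illustration; the expected results follow from the definition of `ordered` above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "assert ordered('BBBbbb')      # only black suits, so adjacency can't be avoided\n",
+    "assert ordered('BBBRbb')      # the red run separates the two black runs\n",
+    "assert not ordered('BBBbbR')  # the red card could have separated the black runs\n",
+    "assert not ordered('BBrrBB')  # suit 'B' appears in two separate runs"
+   ]
+  },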
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Moving Cards to Make a Hand Ordered\n",
+    "\n",
+    "I won't try to be clever; I'm content to use brute force. I'll say that a collapsed sequence is `orderable` if any of the possible `moves` of a block of cards makes the hand `ordered`. I'll find all possible `moves` by finding all possible `splits` of the cards into a non-empty middle block flanked by (possibly empty) left and right sequences, and then all possible `inserts` of the block back into the rest of the cards. I'll define `orderable_probability(N)` to give the probability that a random `N`-card hand is orderable.\n",
+    "Since many hands will collapse to the same sequence, I'll put an `lru_cache` on `orderable` so that it won't have to repeat computations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "@lru_cache(None)\n",
+    "def orderable(seq): return any(ordered(m) for m in moves(seq))\n",
+    "\n",
+    "def orderable_probability(N):\n",
+    "    \"What's the probability that an N-card hand is orderable?\"\n",
+    "    return sum(p for (hand, p) in deals(N).items() if orderable(collapse(hand)))\n",
+    "\n",
+    "def moves(seq):\n",
+    "    \"All possible ways of moving a single block of cards.\"\n",
+    "    return {collapse(s) for (L, M, R) in splits(seq)\n",
+    "            for s in inserts(M, L + R)}\n",
+    "\n",
+    "def inserts(block, others):\n",
+    "    \"All ways of inserting a block into the other cards.\"\n",
+    "    return [others[:i] + block + others[i:]\n",
+    "            for i in range(len(others) + 1)]\n",
+    "\n",
+    "def splits(seq):\n",
+    "    \"All ways of splitting a hand into a non-empty middle flanked by left and right parts.\"\n",
+    "    return [(seq[:i], seq[i:j], seq[j:])\n",
+    "            for i in range(len(seq))\n",
+    "            for j in range(i + 1, len(seq) + 1)]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# First Answer\n",
+    "\n",
+    "Here's our answer for 6 cards:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Fraction(51083, 83895)"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "orderable_probability(6)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "And an easier-to-read answer for everything up to `N` = 7 cards:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " 1: 100.0% = 1\n",
+      " 2: 100.0% = 1\n",
+      " 3: 100.0% = 1\n",
+      " 4: 100.0% = 1\n",
+      " 5:  85.2% = 213019/249900\n",
+      " 6:  60.9% = 51083/83895\n",
+      " 7:  37.3% = 33606799/90047300\n"
+     ]
+    }
+   ],
+   "source": [
+    "def report(Ns):\n",
+    "    \"Show the probability of orderability, for each N in Ns.\"\n",
+    "    for N in Ns:\n",
+    "        P = orderable_probability(N)\n",
+    "        print('{:2}: {:6.1%} = {}'.format(N, float(P), P))\n",
+    "        \n",
+    "report(range(1, 8))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Getting to `N` = 13\n",
+    "\n",
+    "That looks good, but if we want to get to 13-card hands, we'll have to handle 4<sup>13</sup> = 67,108,864 `deals`, which will take a long time. But I have an idea to speed things up: Consider the sequence `'rbrRrBbRB'`. It has 9 runs, and the most a properly ordered hand can have is 4 runs. What's the largest number of runs that a single move can eliminate? Removing the block can eliminate one run, if the cards on either side of the block are the same suit. Reinserting the block can eliminate 2 more, if the left and right ends of the block match the cards to the left and right of the new position. So that makes 3. Therefore a hand with more than 4 + 3 = 7 runs can never be orderable, and we can skip creating any such hand. I will modify `deals(N)` to drop them.\n",
+    "\n",
+    "Here's an example of moving a block [bracketed] to reduce the number of runs from 7 to 4:\n",
+    "\n",
+    "       bRB[bR]Br   =>   b[bR]RBBr  =   bRBr\n",
+    "    "
+   ]
+  },
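+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The bound can also be checked by brute force. This is only a sketch, assuming `moves` and `collapse` behave as defined above; `sequences` is a helper name introduced only for this check. It confirms the bracketed example, and that no single move brings an 8-run sequence below 5 runs. It calls `moves` directly rather than `orderable`, so it leaves the `orderable` cache untouched."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "from itertools import product\n",
+    "\n",
+    "def sequences(k):\n",
+    "    \"All collapsed sequences with exactly k runs (helper introduced only for this check).\"\n",
+    "    return [''.join(s) for s in product(suits, repeat=k)\n",
+    "            if all(a != b for (a, b) in zip(s, s[1:]))]\n",
+    "\n",
+    "assert len(collapse('bRBbRBr')) == 7             # the example starts with 7 runs\n",
+    "assert 'bRBr' in moves('bRBbRBr')                # and one move gets it down to 4\n",
+    "assert min(len(m) for seq in sequences(8)\n",
+    "           for m in moves(seq)) >= 5             # 8 runs never get down to 4"
+   ]
+  },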
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "@lru_cache()\n",
+    "def deals(N):\n",
+    "    \"A dict of {'chars': probability} for all hands of length N with under 8 runs.\"\n",
+    "    if N == 0:\n",
+    "        return {'': one}\n",
+    "    else:\n",
+    "        return {hand + suit: p * (13 - hand.count(suit)) / (52 - len(hand))\n",
+    "                for (hand, p) in deals(N - 1).items()\n",
+    "                for suit in suits\n",
+    "                if len(collapse(hand + suit)) <= 7} # <<<< CHANGE HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Final Answer\n",
+    "\n",
+    "We're finally ready to go up to `N` = 13. This will take several minutes:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " 1: 100.0% = 1\n",
+      " 2: 100.0% = 1\n",
+      " 3: 100.0% = 1\n",
+      " 4: 100.0% = 1\n",
+      " 5:  85.2% = 213019/249900\n",
+      " 6:  60.9% = 51083/83895\n",
+      " 7:  37.3% = 33606799/90047300\n",
+      " 8:  20.2% = 29210911/144718875\n",
+      " 9:   9.9% = 133194539/1350709500\n",
+      "10:   4.4% = 367755247/8297215500\n",
+      "11:   1.9% = 22673450197/1219690678500\n",
+      "12:   0.7% = 1751664923/238130084850\n",
+      "13:   0.3% = 30785713171/11112737293000\n",
+      "CPU times: user 3min 52s, sys: 3.48 s, total: 3min 55s\n",
+      "Wall time: 4min 8s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%time report(range(1, 14))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It certainly is encouraging that, for everything up to `N` = 7, we get the same answers as the previous `report`.\n",
+    "\n",
+    "# Unit Tests\n",
+    "\n",
+    "To gain confidence in these answers, here are some unit tests. Before declaring my answers definitively correct, I would want a lot more tests, and some independent code reviews."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "def test():\n",
+    "    assert deals(1) == {'B': 1/4, 'R': 1/4, 'b': 1/4, 'r': 1/4}\n",
+    "    assert len(deals(6)) == 4 ** 6\n",
+    "    assert ordered('BBBBBrrrrBBBB') is False\n",
+    "    assert ordered('BBBBBrrrrRRRR') is False\n",
+    "    assert ordered('BBBbbr') is False # Bb\n",
+    "    assert ordered('BBBbbrB') is False # two B's\n",
+    "    assert ordered('BBBbbb') \n",
+    "    assert ordered('BBBbbbB') is False # two B's\n",
+    "    assert ordered('BBBBBrrrrbbbb')\n",
+    "    assert colors('BBBBBrrrrbbbb') == {'r', 'b'}\n",
+    "    assert once_each('Bb')\n",
+    "    assert once_each('BbR')\n",
+    "    assert adjacent_colors('BBBbbR')\n",
+    "    assert not adjacent_colors('BBBBBrrrrBBBB')\n",
+    "    assert collapse('BBBBBrrrrBBBB') == 'BrB'\n",
+    "    assert collapse('brBBrrRR') == 'brBrR'\n",
+    "    assert collapse('bbbbBBBrrr') == 'bBr'\n",
+    "    assert moves('bRb') == {'Rb', 'bR', 'bRb'}\n",
+    "    assert moves('bRBb') == {\n",
+    "        'BbR', 'BbRb', 'RBb', 'RbBb', 'bBRb', 'bBbR', 'bRB', 'bRBb', 'bRbB'}\n",
+    "    assert inserts('BB', '....') == [\n",
+    "        'BB....', '.BB...', '..BB..', '...BB.', '....BB']\n",
+    "    assert splits('123') == [('', '1', '23'), ('', '12', '3'), ('', '123', ''),\n",
+    "                             ('1', '2', '3'), ('1', '23', ''), ('12', '3', '')]\n",
+    "    assert orderable('bBr') # move 'r' after 'b'\n",
+    "    assert orderable('bBrbRBr') # move 'bRB' after first 'b' to get 'bbRBBrr'\n",
+    "    assert orderable('bBrbRBrb') is False\n",
+    "    return True\n",
+    "\n",
+    "test()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Table Size\n",
+    "\n",
+    "A key function in this program is `orderable(seq)`. Let's look at its cache:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "CacheInfo(hits=7198870, misses=4373, maxsize=None, currsize=4373)"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "orderable.cache_info()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
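+    "So we looked at over 7 million hands, but only 4,373 different collapsed sequences. (That is every sequence of 1 to 7 runs, 4 + 12 + 36 + 108 + 324 + 972 + 2,916 = 4,372 of them, plus the one 8-run sequence that `test` passes to `orderable`.) And once we hit `N` = 7, we've seen all the sequences we're ever going to see. From `N` = 8 and up, almost all the computation goes into computing the probability of each hand, not into deciding the orderability of each sequence."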
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}