diff --git a/ipynb/Orderable Cards.ipynb b/ipynb/Orderable Cards.ipynb
index 0fdd406..0bf936b 100644
--- a/ipynb/Orderable Cards.ipynb
+++ b/ipynb/Orderable Cards.ipynb
@@ -12,15 +12,29 @@
"\n",
"> *Suppose you’re playing pitch, in which a hand has six cards. What are the odds that you can accomplish your obsessive goal? What about for another game, where a hand has N cards, somewhere between 1 and 13?*\n",
"\n",
- "# Complexity\n",
+ "The first thing to decide is how many `N`-card hands there are. That will tell me whether I can just use brute force, looking at every possible hand. The answer is (52 choose `N`), so we have:\n",
"\n",
- "The first thing to decide is how many `N`-card hands are there? That will tell if I can just use brute force.\n",
+ "- 6 cards: 20,358,520 hands\n",
+ "- 13 cards: 635,013,559,600 hands \n",
"\n",
- "The answer is (52 choose `N`), and (52 choose 6) is 20,358,520. So it is barely feasible to use brute force there. But I notice that the problem states *\"Numbers don’t matter,\"* so I can just consider the *suits* of the cards. Then there are only 4`N` hands, which is a mere 4,096 for `N` = 6, and a barely feasible 67,108,864 for `N` = 13.\n",
+ "That's too many hands. \n",
+ "\n",
+ "# Abstract Hands\n",
+ "\n",
+ "I notice that the problem states *\"Numbers don’t matter,\"* so I can abstract away from *cards* to *suits*: instead of saying that the first card in this hand is the seven of spades, I can just say it is a spade. Then there are only 4<sup>N</sup> abstract hands (for N ≤ 13), so we have:\n",
+ "\n",
+ "- 6 cards: 4,096 abstract hands\n",
+ "- 13 cards: 67,108,864 abstract hands\n",
+ "\n",
+ "That's a big improvement. \n",
"\n",
"# Deals: Hands and their Probabilities\n",
"\n",
- "There are two red suits and two black suits, so I'll represent the four suits with the characters `'rbRB'`. (I also considered using `'♠️♥️♦️♣️'`.) I'll represent a hand as a string of suits: `'rrBrbr'` is a 6-card hand. I'll define `deals(N)` to return a dict of all possible hands of length `N`, each mapped to the probability of the hand. I'll use exact `Fraction` arithmetic. I'll use `lru_cache` when there are expensive computations that I don't want to repeat."
+ "There are two red suits and two black suits, so I'll represent the four suits with the characters `'rbRB'`. (I also considered using `'♠️♥️♦️♣️'`.) I'll represent an abstract hand as a string of suits: `'rrBrbr'` is a 6-card hand. I'll define `deals(N)` to return a dict of all possible abstract hands of length `N`, each mapped to the probability of the hand. \n",
+ "\n",
+ "With actual hands, every hand has the same probability, because every card is equally likely to be the next card dealt. But with abstract hands, the probability of the next suit depends on how many cards of that suit have already been dealt. If I've already dealt the 12 cards `'rrrrrrrrrrrr'`, then the probability of the next card being an `'r'` is 1/40, and the probability of it being a `'b'` is 13/40. So as I build up the abstract hands, I'll need to keep track of the number of remaining cards of each suit.\n",
+ "\n",
+ "I'll use `Fraction` to get exact arithmetic and `lru_cache` to avoid repeated computations."
]
},
{
@@ -32,7 +46,6 @@
"outputs": [],
"source": [
"import re\n",
- "from collections import defaultdict, Counter\n",
"from fractions import Fraction\n",
"from functools import lru_cache\n",
"\n",
@@ -41,7 +54,7 @@
"\n",
"@lru_cache()\n",
"def deals(N):\n",
- " \"A dict of {'chars': probability} for all hands of length N.\"\n",
+ " \"A dict of {hand: probability} for all hands of length N.\"\n",
" if N == 0:\n",
" return {'': one}\n",
" else:\n",
@@ -95,15 +108,15 @@
"\n",
"# Collapsing Hands\n",
"\n",
- "I'll introduce the idea of *collapsing* a hand by replacing a run of cards of the same suit with a single card, so that: \n",
+ "I'll introduce the idea of *collapsing* an abstract hand by replacing a run of cards of the same suit with a single card, so that: \n",
"\n",
" collapse('BBBBBrrrrBBBB') == 'BrB'\n",
" \n",
- "I'll use the term *hand* for `'BBBBBrrrrBBBB'`, and *sequence* or *seq* for the collapsed version, `'BrB'`.\n",
+ "From now on I'll use the term *hand* rather than *abstract hand* for things like `'BBBBBrrrrBBBB'`, and I'll use the terms *sequence* or *seq* for the collapsed version, `'BrB'`.\n",
"\n",
"# Properly Ordered Hands\n",
"\n",
- "A hand is considered properly `ordered` if *\"the cards of a given suit are grouped together and, if possible, such that no suited groups of the same color are adjacent.\"* I was initially confused about the meaning of *\"if possible\";* Matt Ginsberg confirmed it means that the hand `'BBBbbb'` is properly ordered, because it is not possible to separate the two black suits, while `'BBBbbR'` is not properly ordered, because the red card could have been inserted between the two black runs.\n",
+ "A hand is considered properly `ordered` if *\"the cards of a given suit are grouped together and, if possible, such that no suited groups of the same color are adjacent.\"* I was initially confused about the meaning of *\"if possible\";* Matt Ginsberg confirmed it means *\"if it is possible to separate the colors in any number of moves\"*, and thus that the hand `'BBBbbb'` is properly ordered, because it is not possible to separate the two black suits, while `'BBBbbR'` is not properly ordered, because the red card could have been inserted between the two black runs.\n",
"\n",
"So a hand is properly ordered if, considering its collapsed sequence, each suit appears only once, and either all the colors are the same, or suits of the same color don't appear adjacent to each other."
]
@@ -121,8 +134,11 @@
" seq = collapse(hand)\n",
" return once_each(seq) and (len(colors(seq)) == 1 or not adjacent_colors(seq))\n",
" \n",
- "def collapse(hand): return re.sub(r'(.)\\1+', r'\\1', hand)\n",
- "def once_each(seq): return max(Counter(seq).values()) == 1\n",
+ "def collapse(hand):\n",
+ " \"Collapse identical adjacent characters to one character.\"\n",
+ " return ''.join(hand[i] for i in range(len(hand)) if i == 0 or hand[i] != hand[i-1])\n",
+ "\n",
+ "def once_each(seq): return len(seq) == len(set(seq))\n",
"def colors(seq): return set(seq.casefold())\n",
"adjacent_colors = re.compile('rR|Rr|Bb|bB').search"
]
@@ -190,7 +206,9 @@
"outputs": [],
"source": [
"@lru_cache(None)\n",
- "def orderable(seq): return any(ordered(m) for m in moves(seq))\n",
+ "def orderable(seq): \n",
+ " \"Can this collapsed sequence be put into proper order in one move?\"\n",
+ " return any(ordered(m) for m in moves(seq))\n",
"\n",
"def orderable_probability(N):\n",
" \"What's the probability that an N-card hand is orderable?\"\n",
@@ -198,8 +216,8 @@
"\n",
"def moves(seq):\n",
" \"All possible ways of moving a single block of cards.\"\n",
- " return {collapse(s) for (L, M, R) in splits(seq)\n",
- " for s in inserts(M, L + R)}\n",
+ " return {collapse(s) for (L, block, R) in splits(seq)\n",
+ " for s in inserts(block, L + R)}\n",
"\n",
"def inserts(block, others):\n",
" \"All ways of inserting a block into the other cards.\"\n",
@@ -207,7 +225,7 @@
" for i in range(len(others) + 1)]\n",
"\n",
"def splits(seq):\n",
- " \"All ways of splitting a hand into a non-empty middle flanked by left and right parts.\"\n",
+ " \"All ways of splitting a hand into a non-empty block flanked by left and right parts.\"\n",
" return [(seq[:i], seq[i:j], seq[j:])\n",
" for i in range(len(seq))\n",
" for j in range(i + 1, len(seq) + 1)]"
@@ -248,7 +266,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "And an easier-to-read answer for everthing up to `N` = 7 cards:"
+ "And an easier-to-read answer for everything up to `N` = 6 cards:"
]
},
{
@@ -262,13 +280,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
+ " 0: 0.0% = 0\n",
" 1: 100.0% = 1\n",
" 2: 100.0% = 1\n",
" 3: 100.0% = 1\n",
" 4: 100.0% = 1\n",
" 5: 85.2% = 213019/249900\n",
- " 6: 60.9% = 51083/83895\n",
- " 7: 37.3% = 33606799/90047300\n"
+ " 6: 60.9% = 51083/83895\n"
]
}
],
@@ -279,7 +297,7 @@
" P = orderable_probability(N)\n",
" print('{:2}: {:6.1%} = {}'.format(N, float(P), P))\n",
" \n",
- "report(range(1, 8))"
+ "report(range(7))"
]
},
{
@@ -288,12 +306,16 @@
"source": [
"# Getting to `N` = 13\n",
"\n",
- "That looks good, but if we want to get to 13-card hands, we'll have to handle 413 = 67,108,864 `deals`, which will take a long time. But I have an idea to speed things up: Consider the sequence `'rbrRrBbRB'`. It has 9 runs, and the most a properly ordered hand can have is 4 runs. What's the most number of runs that can be reduced by a singe move? One run could be reduced when we remove block, if the cards on either side of the block are the same. When we replace the block, we can reduce 2 more, if the left and right ends of the block match the cards to the left and right of the new position. So that makes 3. Therefore, we can skip creating any hand with more than 7 runs. I will modify `deals(N)` to drop any such hands.\n",
+ "That looks good, but if we want to get to 13-card hands, we would have to handle 4<sup>13</sup> = 67,108,864 `deals`, which would take a while. I can speed things up by taking advantage of two key properties of orderable sequences:\n",
"\n",
- "Here's an example of a moving a block [bracketed] to reduce the number of runs from 7 to 4:\n",
+ "1. **An orderable sequence can have at most 7 runs.** We know that a properly ordered hand can have at most 4 runs. But a move can reduce the number of runs by only 3 at the most: one run can be reduced when we remove the block (if the cards on either side of the block are the same), and two more can be reduced when we replace the block (if the left and right ends of the block match the suits to the left and right of the new position). Here's an example of moving a block [bracketed] to reduce the number of runs from 6 to 3:\n",
"\n",
- " bRB[bR]Br => b[bR]RBBr = bRBr\n",
- " "
+ " bRB[bR]B => b[bR]RBB = bRB\n",
+ " \n",
+ "2. **Adding a suit to the end of an unorderable sequence can't make it orderable.** Whichever block we move, inserting one more suit can't turn an unordered result into an ordered one.\n",
+ "\n",
+ "\n",
+ "I'll redefine `deals(N)` to hold only orderable hands, redefine `orderable(seq)` to immediately reject sequences longer than 7, and redefine `orderable_probability(N)` to just add up the probabilities in `deals(N)`:"
]
},
{
@@ -306,14 +328,23 @@
"source": [
"@lru_cache()\n",
"def deals(N):\n",
- " \"A dict of {'chars': probability} for all hands of length N with under 8 runs.\"\n",
+ " \"A dict of {hand: probability} for all orderable hands of length N.\"\n",
" if N == 0:\n",
" return {'': one}\n",
" else:\n",
" return {hand + suit: p * (13 - hand.count(suit)) / (52 - len(hand))\n",
" for (hand, p) in deals(N - 1).items()\n",
" for suit in suits\n",
- " if len(collapse(hand + suit)) <= 7} # <<<< CHANGE HERE"
+ " if orderable(collapse(hand + suit))}\n",
+ " \n",
+ "@lru_cache(None)\n",
+ "def orderable(seq): \n",
+ " \"Can this collapsed sequence be put into proper order in one move?\"\n",
+ " return len(seq) <= 7 and any(ordered(m) for m in moves(seq))\n",
+ "\n",
+ "def orderable_probability(N):\n",
+ " \"What's the probability that an N-card hand is orderable?\"\n",
+ " return sum(deals(N).values())"
]
},
{
@@ -322,7 +353,7 @@
"source": [
"# Final Answer\n",
"\n",
- "We're finaly ready to go up to `N` = 13. This will take several minutes:"
+ "We're finally ready to go up to `N` = 13:"
]
},
{
@@ -336,6 +367,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
+ " 0: 100.0% = 1\n",
" 1: 100.0% = 1\n",
" 2: 100.0% = 1\n",
" 3: 100.0% = 1\n",
@@ -349,24 +381,22 @@
"11: 1.9% = 22673450197/1219690678500\n",
"12: 0.7% = 1751664923/238130084850\n",
"13: 0.3% = 30785713171/11112737293000\n",
- "CPU times: user 3min 52s, sys: 3.48 s, total: 3min 55s\n",
- "Wall time: 4min 8s\n"
+ "CPU times: user 16.6 s, sys: 337 ms, total: 17 s\n",
+ "Wall time: 19.3 s\n"
]
}
],
"source": [
- "%time report(range(1, 14))"
+ "%time report(range(14))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "It certainly is encouraging that, for everything up to `N` = 7, we get the same answers as the previous `report`.\n",
+ "# Cache Sizes\n",
"\n",
- "# Unit Tests\n",
- "\n",
- "To gain confidence in these answers, here are some unit tests. Before declaring my answers definitively correct, I would want a lot more tests, and some independent code reviews."
+ "Let's look at the cache for `orderable(seq)`:"
]
},
{
@@ -379,7 +409,7 @@
{
"data": {
"text/plain": [
- "True"
+ "CacheInfo(hits=1438512, misses=1540, maxsize=None, currsize=1540)"
]
},
"execution_count": 11,
@@ -387,10 +417,44 @@
"output_type": "execute_result"
}
],
+ "source": [
+ "orderable.cache_info()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "So we looked at over 1.4 million hands, but only 1540 different collapsed sequences. And once we hit `N` = 7, we've seen all the sequences we're ever going to see. From `N` = 8 and up, almost all the computation goes into computing the probability of each hand, and collapsing the hand into a sequence, not into deciding the orderability of each sequence.\n",
+ "\n",
+ "We save a lot of space in the `deals(N)` caches. Instead of storing all 4<sup>13</sup> hands for `deals(13)`, the output above says that just 0.3% of the hands are orderable, so we reduced the cache size by a factor of 300 or so.\n",
+ "\n",
+ "# Unit Tests\n",
+ "\n",
+ "To gain confidence in these answers, here are some unit tests. Before declaring my answers definitively correct, I would want a lot more tests, and some independent code reviews."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"def test():\n",
" assert deals(1) == {'B': 1/4, 'R': 1/4, 'b': 1/4, 'r': 1/4}\n",
- " assert len(deals(6)) == 4 ** 6\n",
" assert ordered('BBBBBrrrrBBBB') is False\n",
" assert ordered('BBBBBrrrrRRRR') is False\n",
" assert ordered('BBBbbr') is False # Bb\n",
@@ -420,44 +484,6 @@
"\n",
"test()"
]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Table Size\n",
- "\n",
- "A key function in this program is `orderable(seq)`. Let's look at its cache:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {
- "collapsed": false
- },
- "outputs": [
- {
- "data": {
- "text/plain": [
- "CacheInfo(hits=7198870, misses=4373, maxsize=None, currsize=4373)"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "orderable.cache_info()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "So we looked at over 7 million hands, but only 4373 different collapsed sequences. And once we hit `N` = 7, we've seen all the sequences we're ever going to see. From `N` = 8 and up, almost all the computation goes into computing the probability of each hand, not into deciding the orderability of each sequence."
- ]
}
],
"metadata": {