Add files via upload

This commit is contained in:
Peter Norvig 2017-08-30 23:55:29 -07:00 committed by GitHub
parent 77224809a1
commit 07cb5b5480


@ -9,10 +9,10 @@
"outputs": [],
"source": [
"import re\n",
"import itertools\n",
"from collections import defaultdict\n",
"from functools import lru_cache\n",
"from math import factorial"
"from itertools import product, chain, permutations\n",
"from collections import defaultdict\n",
"from functools import lru_cache as cache\n",
"from math import factorial"
]
},
{
@ -21,7 +21,7 @@
"source": [
"# How to Count Things\n",
"\n",
"This notebook contains problems designed to show how to count things. Right now there are four example problems.\n",
"This notebook contains problems designed to show how to count things. So far there are five example problems.\n",
"\n",
"# (1) Student Records: Late, Absent, Present\n",
"\n",
@ -95,7 +95,7 @@
"\n",
"def all_strings(alphabet, N): \n",
" \"All length-N strings over the given alphabet.\"\n",
" return map(cat, itertools.product(alphabet, repeat=N))\n",
" return map(cat, product(alphabet, repeat=N))\n",
"\n",
"def quantify(iterable, pred=bool) -> int:\n",
" \"Count how many times the predicate is true of items in iterable.\"\n",
@ -263,8 +263,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.14 ms, sys: 0 ns, total: 1.14 ms\n",
"Wall time: 1.15 ms\n"
"CPU times: user 1.21 ms, sys: 111 µs, total: 1.32 ms\n",
"Wall time: 1.35 ms\n"
]
},
{
@ -337,8 +337,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.65 ms, sys: 43 µs, total: 1.7 ms\n",
"Wall time: 1.7 ms\n"
"CPU times: user 1.82 ms, sys: 46 µs, total: 1.87 ms\n",
"Wall time: 1.88 ms\n"
]
},
{
@ -439,7 +439,7 @@
"\n",
"def all_strings(k): \n",
" \"All strings of length k over an alphabet of k ints.\"\n",
" return itertools.product(range(k), repeat=k)"
" return product(range(k), repeat=k)"
]
},
{
@ -519,7 +519,7 @@
"source": [
"Now let's think about how to speed that up. I don't want to have to consider every possible string, because there are too many ($k^k$) of them. Can I group together many strings and just count the number of them, without enumerating each one? For example, if I knew there were 52 valid strings of length $k-1$ (and didn't know anything else about them), can I tell how many valid strings of length $k$ there are? I don't see a way to do this directly, because the number of ways to extend a valid string is dependent on the number of distinct characters in the string. If a string has $m$ distinct characters, then I can extend it in $m$ waysby repeating any of those $m$ characters, or I can introduce a first occurrence of character number $m+1$ in just 1 way.\n",
"\n",
"So I need to keep track of the number of valid strings of length $k$ that have exactly $m$ distinct characters (those characters must be exactly `range(m)`). I'll call that number `C(k, m)`. Because I can reach a recursive call to `C(k, m)` by many paths, I will use the `lru_cache` decorator to keep track of the computations that I have already done. Then I can define `how_many(k)` as the sum over all values of `m` of `C(k, m)`:"
"So I need to keep track of the number of valid strings of length $k$ that have exactly $m$ distinct characters (those characters must be exactly `range(m)`). I'll call that number `C(k, m)`. Because I can reach a recursive call to `C(k, m)` by many paths, I will use the `cache` decorator to keep track of the computations that I have already done. Then I can define `how_many(k)` as the sum over all values of `m` of `C(k, m)`:"
]
},
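A hedged sketch of the recurrence described in the markdown cell above. It is not the notebook's own code (the diff shows only the first line of `C`), and it assumes that a string is valid when each new character first appears in increasing order: 0 first, then 1, and so on. The names `count_strings`, `total`, `is_valid`, and `brute_force` are hypothetical and chosen for illustration only.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def count_strings(k, m):
    "Valid strings of length k with exactly m distinct characters (assumed rule)."
    if k == 0:
        return 1 if m == 0 else 0
    if m <= 0 or m > k:
        return 0
    # Either repeat one of the m characters already present (m ways),
    # or introduce the m-th distinct character for the first time (1 way).
    return m * count_strings(k - 1, m) + count_strings(k - 1, m - 1)

def total(k):
    "Sum over every possible number of distinct characters."
    return sum(count_strings(k, m) for m in range(k + 1))

def is_valid(s):
    "Assumed validity rule: first occurrences appear in the order 0, 1, 2, ..."
    seen = 0
    for ch in s:
        if ch > seen:
            return False
        if ch == seen:
            seen += 1
    return True

def brute_force(k):
    "Enumerate all k**k strings; only practical for small k."
    return sum(1 for s in product(range(k), repeat=k) if is_valid(s))

print([total(k) for k in range(7)])  # [1, 1, 2, 5, 15, 52, 203]
```

Under this assumption the recurrence and the brute-force enumeration agree for small `k`, and `total(5)` is 52, the figure the prose uses as its example.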
{
@ -530,7 +530,7 @@
},
"outputs": [],
"source": [
"@lru_cache()\n",
"@cache()\n",
"def C(k, m) -> int:\n",
" \"Count the number of valid strings of length k, that use m distinct characters.\"\n",
" return (1 if k == 0 == m else\n",
@ -606,7 +606,7 @@
}
],
"source": [
"for k in itertools.chain(range(10), range(10, 121, 10)):\n",
"for k in chain(range(10), range(10, 121, 10)):\n",
" print('{:3} {:12g}'.format(k, how_many(k)))"
]
},
@ -773,20 +773,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# (4) Counting Paths on a Grid\n",
"# (4) Counting Positions in Fischerandom Chess\n",
"\n",
"Consider the following grid, where the goal is to get from `S` to `G`, making only \"right\" or \"down\" moves:\n",
"In this [variant](https://en.wikipedia.org/wiki/Chess960) of chess, the pieces are set up in a random but restricted fashion. The pawns are in their regular positions, and the major white pieces are placed randomly on the first rank, with two restrictions: the bishops must be placed on opposite-color squares, and the king must be placed between the rooks. The black pieces are set up to mirror the white pieces. How many starting positions are there?\n",
"\n",
" S..........\n",
" ...........\n",
" ...........\n",
" ...........\n",
" ...........\n",
" ..........G\n",
" \n",
"One solution path would be to go right 10 times, then go down 5 times. But you could also go down 3 times, then right 10 times, then down 2 times; or take many other paths. How many paths are there? We can use the same three methods we used for the previous puzzle:\n",
"\n",
"**Method 1: Count all permutations and divide by repetitions:** Any path must consist of 10 right and 5 down moves, but they can appear in any order. Arranging 15 things in any order gives 15! = 1,307,674,368,000 possible paths. But that counts all the moves as being distinct, when actually the 10 right moves are indistinguishable, as are the 5 right moves, so we need to divide by the number of ways that they can be arranged. That gives us:"
"We can answer by generating all distinct permutations of the eight pieces and quantifying (counting) the number of permutations that are legal according to the two restrictions:"
]
},
{
@ -797,7 +788,7 @@
{
"data": {
"text/plain": [
"3003.0"
"960"
]
},
"execution_count": 26,
@ -805,6 +796,60 @@
"output_type": "execute_result"
}
],
"source": [
"from statistics import median\n",
"\n",
"def legal(pieces):\n",
" B, R, K = map(pieces.index, 'BRK')\n",
" b, r = map(cat(pieces).rindex, 'BR')\n",
" return (B % 2 != b % 2) and median([R, K, r]) == K\n",
"\n",
"quantify(set(permutations('RNBKQBNR')), legal)"
]
},
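As a cross-check on the brute-force count above, the same number can be reached by direct multiplication. The bishops have 4 × 4 choices (one on each square color), the queen takes any of the 6 remaining squares, the knights take 2 of the remaining 5, and the last three squares must hold rook, king, rook in that order. The sketch below is self-contained; `is_legal` is a hypothetical restatement of the two restrictions, not the notebook's `legal` function.

```python
from itertools import permutations
from math import comb  # Python 3.8+

def is_legal(pieces):
    "Bishops on opposite-color squares, king between the rooks."
    row = ''.join(pieces)  # strings have both index and rindex
    b1, b2 = row.index('B'), row.rindex('B')
    r1, r2 = row.index('R'), row.rindex('R')
    return b1 % 2 != b2 % 2 and r1 < row.index('K') < r2

# Enumerate the distinct arrangements and count the legal ones.
brute_force = sum(is_legal(p) for p in set(permutations('RNBKQBNR')))

# Bishops: 4 * 4, queen: 6, knights: C(5, 2), rook-king-rook: 1 way.
by_formula = 4 * 4 * 6 * comb(5, 2)

print(brute_force, by_formula)  # 960 960
```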
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Note:* initially I wrote `pieces.rindex`, because I forgot that while tuples, lists and strings all have an `index` method, only strings have `rindex`. How annoying! In Ruby, both strings and arrays have `index` and `rindex`. In Java and Javascript, both strings and lists/arrays have both `indexOf` and `lastIndexOf`. What's wrong with Python?"
]
},
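One possible workaround for the missing `rindex` on tuples and lists is a small helper that scans from the right. The name `last_index` is hypothetical; this is a sketch, not something the notebook defines.

```python
def last_index(seq, item):
    "Index of the last occurrence of item in any indexable sequence."
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] == item:
            return i
    raise ValueError('{!r} is not in sequence'.format(item))

pieces = tuple('RNBKQBNR')
print(last_index(pieces, 'B'), last_index(pieces, 'R'))  # 5 7
```

Because it relies only on `len` and indexing, the same helper works for tuples, lists, and strings, so no conversion with `cat` would be needed.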
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# (5) Counting Paths on a Grid\n",
"\n",
"Consider the following grid, where the goal is to find a path from `S` to `G`, making only \"right\" or \"down\" moves:\n",
"\n",
" S..........\n",
" ...........\n",
" ...........\n",
" ...........\n",
" ...........\n",
" ..........G\n",
" \n",
"One solution path would be to go right 10 times, then go down 5 times. But you could also go down 3 times, then right 10 times, then down 2 times; or take many other paths. How many paths are there? We can use the same three methods we used for the previous puzzle:\n",
"\n",
"**Method 1: Count all permutations and divide by repetitions:** Any path must consist of 10 right and 5 down moves, but they can appear in any order. Arranging 15 things in any order gives 15! = 1,307,674,368,000 possible paths. But that counts all the moves as being distinct, when actually the 10 right moves are indistinguishable, as are the 5 down moves, so we need to divide by the number of ways that they can be arranged. That gives us:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3003.0"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"factorial(15) / factorial(10) / factorial(5)"
]
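The `3003.0` above is a float because `/` is true division in Python 3. A minimal sketch of two ways to keep the count as an exact integer, assuming Python 3.8 or later for `math.comb`:

```python
from math import comb, factorial

# Floor division on integers stays exact, however large the factorials get.
exact = factorial(15) // (factorial(10) * factorial(5))

# comb(n, k) computes "n choose k" directly as an int.
also_exact = comb(15, 5)

print(exact, also_exact)  # 3003 3003
```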
@ -813,15 +858,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**Method 2: Count without repetitions**: Another way to look at it is that there will be 15 moves, so start with all 15 being \"right\" moves and then choose 5 of them to become \"up\" moves. So the answer is (15 choose 5), which leads to the same formula we just used.\n",
"**Method 2: Count without repetitions**: Another way to look at it is that there will be 15 total moves, so start with all 15 being \"right\" moves and then choose 5 of them to become \"down\" moves. So the answer is (15 choose 5), which leads to the same formula we just used.\n",
"\n",
"**Method 3: Write a program to count the paths:** We can define the function `paths(start, goal)` to count the number of paths from start location to goal location, where a location is a `(column, row)` pair of integers.\n",
"In general, the number of paths to the goal is the number of paths to the location just to the left of the goal, plus the number of paths to the location just above the goal. But there are two special cases: there is only one path (the empty path) when the start is equal to the goal, and there are zero possible paths when the goal is off the board."
"In general, the number of paths to the goal is the number of paths to the location just to the left of the goal, plus the number of paths to the location just above the goal. But there are two special cases: there is only one path (the empty path) when the start is equal to the goal, and there are zero paths when the goal is off the board."
]
},
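The rule in Method 3 can be sketched as a short memoized recursion. The diff shows only the first lines of the notebook's `paths`, so the body below is a reconstruction from the prose, under the assumption that "off the board" means a negative column or row. The name `count_paths` is hypothetical.

```python
from functools import lru_cache
from math import comb  # Python 3.8+, used only for the cross-check

@lru_cache(maxsize=None)
def count_paths(start, goal):
    "Number of right/down paths from start to goal; locations are (column, row)."
    (col, row) = goal
    if goal == start:
        return 1            # the empty path
    if col < 0 or row < 0:
        return 0            # stepped off the board
    # Arrive from the left neighbor or from the neighbor above.
    return (count_paths(start, (col - 1, row)) +
            count_paths(start, (col, row - 1)))

print(count_paths((0, 0), (10, 5)), comb(15, 5))  # 3003 3003
```

Memoization matters here: without the cache the recursion would revisit the same locations exponentially often.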
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 28,
"metadata": {},
"outputs": [
{
@ -830,13 +875,13 @@
"3003"
]
},
"execution_count": 27,
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@lru_cache()\n",
"@cache()\n",
"def paths(start, goal):\n",
" \"Number of paths to goal, using only 'right' and 'down' moves.\"\n",
" (col, row) = goal\n",
@ -856,15 +901,15 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 220 ms, sys: 2.84 ms, total: 223 ms\n",
"Wall time: 222 ms\n"
"CPU times: user 222 ms, sys: 2.81 ms, total: 225 ms\n",
"Wall time: 225 ms\n"
]
},
{
@ -873,7 +918,7 @@
"4158251463258564744783383526326405580280466005743648708663033657304756328324008620"
]
},
"execution_count": 28,
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
@ -886,7 +931,7 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 30,
"metadata": {},
"outputs": [
{
@ -895,7 +940,7 @@
"True"
]
},
"execution_count": 29,
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
@ -908,12 +953,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Why bother with the recursive function when the formula works so well? Good question. One reason is that the different approaches reinforce each other by giving the same answer. Another reason is that we can modify the `paths` function to handle grids that have obstacles in them. I'll define a `Grid` data type, and any cell in the grid that is not a `'.'` will be considered an impassible barrier."
"Why bother with the recursive function when the formula works so well? Good question. One reason is that the two different approaches validate each other by giving the same answer. Another reason is that we can modify the `paths` function to handle grids that have obstacles in them. I'll define a `Grid` constructor, and any cell in the grid that is not a `'.'` will be considered an impassible barrier."
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 31,
"metadata": {
"collapsed": true
},
@ -921,14 +966,17 @@
"source": [
"def Grid(text): return tuple(text.split())\n",
"\n",
"@lru_cache()\n",
"def paths2(grid, start, goal):\n",
"@cache()\n",
"def paths2(grid, start=(0, 0), goal=None):\n",
" \"Number of paths to goal, using only 'right' and 'down' moves.\"\n",
" goal = goal or bottom_right(grid)\n",
" (col, row) = goal\n",
" return (1 if goal == start else\n",
" 0 if col < 0 or row < 0 or grid[col][row] != '.' else\n",
" paths2(grid, start, (col - 1, row)) + \n",
" paths2(grid, start, (col, row - 1)))"
" paths2(grid, start, (col, row - 1)))\n",
"\n",
"def bottom_right(grid): return (len(grid) - 1, len(grid[0]) - 1)"
]
},
{
@ -940,7 +988,7 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 32,
"metadata": {},
"outputs": [
{
@ -949,7 +997,7 @@
"3003"
]
},
"execution_count": 31,
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
@ -961,7 +1009,8 @@
"...........\n",
"...........\n",
"...........\n",
"...........\"\"\"), (0, 0), (5, 10))"
"...........\n",
"\"\"\"))"
]
},
{
@ -971,39 +1020,6 @@
"Here's a grid where there should be only two paths:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paths2(Grid(\"\"\"\n",
"...........\n",
".........|.\n",
".........|.\n",
".........|.\n",
".--------+.\n",
"...........\"\"\"), (0, 0), (5, 10))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we break down the wall, there should be many paths (but less than 3003):"
]
},
{
"cell_type": "code",
"execution_count": 33,
@ -1012,7 +1028,7 @@
{
"data": {
"text/plain": [
"992"
"2"
]
},
"execution_count": 33,
@ -1025,16 +1041,17 @@
"...........\n",
".........|.\n",
".........|.\n",
".........|.\n",
".--------+.\n",
"...........\n",
".-------...\n",
"...........\"\"\"), (0, 0), (5, 10))"
"\"\"\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can work on a larger (20 by 10) grid, but I can't verify for sure that this answer is correct:"
"If we tear down that wall, there should be many paths (but less than 3003 because some of the wall remains):"
]
},
{
@ -1045,7 +1062,7 @@
{
"data": {
"text/plain": [
"58975"
"992"
]
},
"execution_count": 34,
@ -1053,18 +1070,105 @@
"output_type": "execute_result"
}
],
"source": [
"paths2(Grid(\"\"\"\n",
"...........\n",
".........|.\n",
".........|.\n",
"...........\n",
".-------...\n",
"...........\n",
"\"\"\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's a bigger, and a much bigger example:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"58975"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paths2(Grid(r\"\"\"\n",
"................\\---\n",
"../......|..........\n",
"./..[]...|.[].|...\\.\n",
"./..()...|.().|...\\.\n",
".\\............|.....\n",
"..\\----....|..|.....\n",
".......\\...|........\n",
"\\.......\\...........\n",
"-\\.............[]...\n",
"-\\.............()...\n",
"--\\.................\n",
"---\\....../\\........\"\"\"), (0, 0), (9, 19))"
"---\\....../\\........\n",
"\"\"\"))"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"121480689204"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"paths2(Grid(r\"\"\"\n",
"....................http://www.ascii-art-generator.org/.................\n",
"........................................................................\n",
".......................WWNK0OkxdoooolooddxkO0KXW........................\n",
".................WX0kdolc::::::::::cc::::::::::clodk0XW.................\n",
"............WXOdl::::cldxkO0KKKKKKXXXXKKKKKK0Okxdlc:;;:ldOXW............\n",
".........N0dc;;coxkxxdxKXXXXXXXKddKXXKxdKXXXXXXXKxdxxxxoc;;cd0N.........\n",
"........d:,:oxkdl:,..'xXXXXXXXX0:.,;;;.:0XXXXXXXXx'..':ldkxo:,:d0W......\n",
"....W0l.;okxl;.......cKXXXXXXXXO,......,kXXXXXXXXKc.......;lxko;,l0W....\n",
"...Xo';.Od;..........;OXXXXXXXKl........lKXXXXXXXO:..........;dkd;,oX...\n",
"..K:'lO.,.............;dOKKK0x:..........:x0KKKOd;.............,xOl':K..\n",
".Xc.o0o..................,,,'...............,,,..................o0o.cX.\n",
".k';0k'..........................................................'k0;'k.\n",
".d.cKd............................................................dKc.d.\n",
".k';Ok,...........................................................kO;'k.\n",
".Xl.l0d'.........''..................................''...........0l.cX.\n",
"..Kc'cOk;......;x000ko;..,okOOd;........;dOOko,..;ok000x;......;x..'cK..\n",
"...Xd,,dkd:....oXXXXXXKkx0XXXXXKd'....'dKXXXXX0xkKXXXXXXo....;dkd;.dX...\n",
"....WKo,;lxxo;':OXXXXXXXXXXXXXXXXx,..'xXNXXXXXXXXXXXXXXO:'.oxxl;,l.W....\n",
"......WKx:,:lxxxOXXXXXXXXXXXXXXXXXx::xXXXXXXXXXXXXXXXXXOxx..:,:dKW......\n",
".........WKxl;;:ldk0KXXXXNNXXXXXXXXKKXXXXXXXXXXXXXXK0kdl:;.cxKW.........\n",
"............WN0xoc:;:clodxkO00KKKXXXXXXKKK00Okxdol::;:cox0.W............\n",
".................WNKOxdlcc::::::::::::::::::::ccldxOKNW.................\n",
"........................WWXK0OkkxxddddxxkkO0KXNW........................\n",
"........................................................................\n",
"\"\"\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Can you verify that these last three answers are correct?"
]
}
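One way to gain confidence in these answers is to recompute them with an independent method. The sketch below fills a table bottom-up, row by row, instead of recursing from the goal. The names `parse_grid` and `count_paths_dp` are hypothetical. It indexes cells as `grid[row][column]`, always starts at the top-left cell, and always ends at the bottom-right cell. Unlike `paths2`, it counts zero paths when the start cell itself is a barrier.

```python
def parse_grid(text):
    "Split a picture into a tuple of row strings (same idea as Grid)."
    return tuple(text.split())

def count_paths_dp(grid):
    "Count right/down paths from top-left to bottom-right, avoiding non-'.' cells."
    n_rows, n_cols = len(grid), len(grid[0])
    table = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != '.':
                continue                      # barrier: no paths end here
            if r == 0 and c == 0:
                table[r][c] = 1               # the empty path
            else:
                above = table[r - 1][c] if r > 0 else 0
                left = table[r][c - 1] if c > 0 else 0
                table[r][c] = above + left
    return table[-1][-1]

open_grid = parse_grid("\n".join(["..........."] * 6))

walled = parse_grid("""
...........
.........|.
.........|.
.........|.
.--------+.
...........
""")

broken_wall = parse_grid("""
...........
.........|.
.........|.
...........
.-------...
...........
""")

print(count_paths_dp(open_grid), count_paths_dp(walled), count_paths_dp(broken_wall))
# 3003 2 992
```

Agreement between the two methods on the small grids does not prove the larger answers, but the same function can be run on the larger grids as a further cross-check.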
],