Add files via upload
This commit is contained in:
parent
56b1aab373
commit
f63d7be48a
@ -28,7 +28,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"# My Approach\n",
|
"# My Approach\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Since the referenced [word list](https://norvig.com/ngrams/enable1.txt) came from *my* web site (it is a standard Scrabble word list that I host a copy of), I felt somewhat compelled to solve this. I had worked on word games before, like Scrabble and Boggle. This puzzle is different because it deals with *unordered sets* of letters, not *ordered permutations* of letters. That makes things much easier. When I searched for an optimal 5×5 Boggle board, I couldn't exhaustively try all $26^{(5×5)} \\approx 10^{35}$ possibilities; I could only do hillclimbing to find a local maximum. But for Spelling Bee, it is feasible to try every possibility and get a guaranteed highest-scoring honeycomb. Here's a sketch of my approach:\n",
|
"Since the referenced [word list](https://norvig.com/ngrams/enable1.txt) came from *my* web site (I didn't make up the list; it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to solve this one. \n",
|
||||||
|
"\n",
|
||||||
|
"This puzzle is different from other word puzzles because it deals with *unordered sets* of letters, not *ordered permutations* of letters. That makes things easier. When I searched for an optimal 5×5 Boggle board, I couldn't exhaustively try all $26^{(5×5)} \\approx 10^{35}$ possibilities; I could only do hillclimbing to find a local maximum. But for Spelling Bee, it is feasible to try every possibility and get a guaranteed highest-scoring honeycomb. Here's a sketch of my approach:\n",
|
||||||
" \n",
|
" \n",
|
||||||
"- Since order and repetition don't count, we can represent a word as a **set** of letters, which I will call a `letterset`. For simplicity I'll choose to implement that as a sorted string (not as a Python `set` or `frozenset`). For example:\n",
|
"- Since order and repetition don't count, we can represent a word as a **set** of letters, which I will call a `letterset`. For simplicity I'll choose to implement that as a sorted string (not as a Python `set` or `frozenset`). For example:\n",
|
||||||
" letterset(\"GLAM\") == letterset(\"AMALGAM\") == \"AGLM\"\n",
|
" letterset(\"GLAM\") == letterset(\"AMALGAM\") == \"AGLM\"\n",
|
||||||
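The `letterset` representation described above can be sketched in a couple of lines (an illustrative sketch; the notebook's own cell may differ in detail):

```python
def letterset(word):
    """The distinct letters of a word, as a sorted string (order/repetition ignored)."""
    return ''.join(sorted(set(word)))

# Matches the example in the text:
assert letterset("GLAM") == letterset("AMALGAM") == "AGLM"
```

A sorted string is hashable, so it can serve directly as a dict or `Counter` key, which a Python `set` cannot.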
@ -37,7 +39,7 @@
|
|||||||
"- Since the rules say every valid honeycomb must contain a pangram, it must be the case that every valid honeycomb *is* a pangram. That means:\n",
|
"- Since the rules say every valid honeycomb must contain a pangram, it must be the case that every valid honeycomb *is* a pangram. That means:\n",
|
||||||
" * The number of valid honeycombs is 7 times the number of pangram lettersets (because any of the 7 letters could be the center).\n",
|
" * The number of valid honeycombs is 7 times the number of pangram lettersets (because any of the 7 letters could be the center).\n",
|
||||||
" * I will consider every valid honeycomb and compute the game score for each one.\n",
|
" * I will consider every valid honeycomb and compute the game score for each one.\n",
|
||||||
" * The one with the highest game score is guaranteed to be the best possible honeycomb.\n"
|
" * The one with the highest game score is guaranteed to be the optimal honeycomb.\n"
|
||||||
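The brute-force search sketched above can be written out as follows. This is a hedged reconstruction, not the notebook's exact code; it assumes the scoring rules stated later (a 4-letter word scores 1 point, longer words score their length, and a pangram earns a 7-point bonus):

```python
def letterset(word):
    "The distinct letters of a word, as a sorted string."
    return ''.join(sorted(set(word)))

def word_score(word):
    "1 point for a 4-letter word; longer words score their length; +7 for a pangram."
    bonus = 7 if len(letterset(word)) == 7 else 0
    return (1 if len(word) == 4 else len(word)) + bonus

def game_score(honeycomb, words):
    "Total score of words that use only the honeycomb's letters and include the center."
    letters, center = honeycomb
    return sum(word_score(w) for w in words
               if center in w and all(c in letters for c in w))

def best_honeycomb(words):
    "Try all 7 centers of every pangram letterset; return [score, honeycomb]."
    pangram_sets = {letterset(w) for w in words if len(letterset(w)) == 7}
    return max([game_score((s, c), words), (s, c)]
               for s in pangram_sets for c in s)
```

On the six-word example used later in the notebook, `game_score(('AEGLMPX', 'G'), words)` gives the 24 points reported there.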
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -116,9 +118,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Note that `I`, `me` and `gem` are too short, `games` has an `S` which is not allowed, and `amalgamation` has too many distinct letters. We're left with six valid words out of the original eleven.\n",
|
"Note that `I`, `me` and `gem` are too short, `games` has an `S` which is not allowed, and `amalgamation` has too many distinct letters (8). We're left with six valid words out of the original eleven. Here are examples of the functions in action:"
|
||||||
"\n",
|
|
||||||
"Here are examples of the functions in action:"
|
|
||||||
]
|
]
|
||||||
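The three filtering rules just described (too short, contains an `S`, more than 7 distinct letters) can be sketched as a single predicate (`is_valid` is an illustrative name, not necessarily the notebook's):

```python
def is_valid(word):
    """A valid Spelling Bee word: at least 4 letters, no 'S', at most 7 distinct letters."""
    return len(word) >= 4 and 'S' not in word and len(set(word)) <= 7

# The examples from the text:
assert not is_valid('GEM')            # too short
assert not is_valid('GAMES')          # contains an S
assert not is_valid('AMALGAMATION')   # 8 distinct letters
assert is_valid('AMALGAM')
```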
},
|
},
|
||||||
{
|
{
|
||||||
@ -366,116 +366,19 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"So: we start with 172,820 words in the enable1 word list, reduce that to 44,585 valid Spelling Bee words, and find that 14,741 of those words are pangrams. \n",
|
"So: we start with 172,820 words in the enable1 word list, reduce that to 44,585 valid Spelling Bee words, and find that 14,741 of those words are pangrams. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"I'm curious: what's the highest-scoring individual word?"
|
"How long will it take to run `best_honeycomb(enable1)`? Let's estimate by checking how long it takes to compute the game score of a single honeycomb:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 14,
|
"execution_count": 14,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
|
||||||
{
|
|
||||||
"data": {
|
|
||||||
"text/plain": [
|
|
||||||
"'ANTITOTALITARIAN'"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"execution_count": 14,
|
|
||||||
"metadata": {},
|
|
||||||
"output_type": "execute_result"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"source": [
|
|
||||||
"max(enable1, key=word_score)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"And what are some of the pangrams?"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": 15,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [
|
|
||||||
{
|
|
||||||
"data": {
|
|
||||||
"text/plain": [
|
|
||||||
"['AARDWOLF',\n",
|
|
||||||
" 'BABBLEMENT',\n",
|
|
||||||
" 'CABEZON',\n",
|
|
||||||
" 'COLLOGUING',\n",
|
|
||||||
" 'DEMERGERING',\n",
|
|
||||||
" 'ETYMOLOGY',\n",
|
|
||||||
" 'GARROTTING',\n",
|
|
||||||
" 'IDENTIFY',\n",
|
|
||||||
" 'LARVICIDAL',\n",
|
|
||||||
" 'MORTGAGEE',\n",
|
|
||||||
" 'OVERHELD',\n",
|
|
||||||
" 'PRAWNED',\n",
|
|
||||||
" 'REINITIATED',\n",
|
|
||||||
" 'TOWHEAD',\n",
|
|
||||||
" 'UTOPIAN']"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"execution_count": 15,
|
|
||||||
"metadata": {},
|
|
||||||
"output_type": "execute_result"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"source": [
|
|
||||||
"pangrams[::1000] # Every thousandth one"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"And what's the breakdown of reasons why words are invalid?\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": 16,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [
|
|
||||||
{
|
|
||||||
"data": {
|
|
||||||
"text/plain": [
|
|
||||||
"[('S', 103913), ('valid', 44585), ('long', 23400), ('short', 922)]"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"execution_count": 16,
|
|
||||||
"metadata": {},
|
|
||||||
"output_type": "execute_result"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"source": [
|
|
||||||
"Counter(('S' if 'S' in w else 'short' if len(w) < 4 else 'long' if len(set(w)) > 7 else 'valid')\n",
|
|
||||||
" for w in open('enable1.txt').read().upper().split()).most_common()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"There are more than twice as many words with an 'S' than there are valid words.\n",
|
|
||||||
"But how long will it take to run the computation on the big `enable1` word list? Let's see how long it takes to compute the game score of a single honeycomb:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": 17,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"CPU times: user 11.9 ms, sys: 506 µs, total: 12.4 ms\n",
|
"CPU times: user 11.9 ms, sys: 391 µs, total: 12.3 ms\n",
|
||||||
"Wall time: 12 ms\n"
|
"Wall time: 12 ms\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@ -485,7 +388,7 @@
|
|||||||
"153"
|
"153"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 17,
|
"execution_count": 14,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -503,7 +406,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 18,
|
"execution_count": 15,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -512,7 +415,7 @@
|
|||||||
"20.6374"
|
"20.6374"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 18,
|
"execution_count": 15,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
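The `20.6374` above appears to be the brute-force estimate in minutes: 14,741 pangram lettersets × 7 centers, at roughly 12 ms per `game_score` call. The arithmetic checks out:

```python
# Back-of-the-envelope check of the brute-force time estimate
# (figures from the notebook: 14,741 pangrams × 7 centers, ~12 ms each).
honeycombs = 14741 * 7
minutes = honeycombs * 0.012 / 60
print(honeycombs, round(minutes, 4))  # → 103187 20.6374
```

Twenty minutes is tolerable but slow, which motivates the optimization that follows.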
@ -533,7 +436,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"1. Keep the same strategy of trying every pangram, but do some precomputation that will make `game_score` much faster.\n",
|
"1. Keep the same strategy of trying every pangram, but do some precomputation that will make `game_score` much faster.\n",
|
||||||
"1. The precomputation is: compute the `letterset` and `word_score` for each word, and make a table of `{letterset: points}` giving the total number of points that can be made with each letterset. I call this a `points_table`.\n",
|
"1. The precomputation is: compute the `letterset` and `word_score` for each word, and make a table of `{letterset: points}` giving the total number of points that can be made with each letterset. I call this a `points_table`.\n",
|
||||||
"3. These calculations are independent of the honeycomb, so they only need to be done, not 14,741 × 7 times. \n",
|
"3. These calculations are independent of the honeycomb, so they need to be done only once, not 14,741 × 7 times. \n",
|
||||||
"4. Within `game_score`, generate every valid **subset** of the letters in the honeycomb. A valid subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ subsets. The function `letter_subsets(honeycomb)` returns these.\n",
|
"4. Within `game_score`, generate every valid **subset** of the letters in the honeycomb. A valid subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ subsets. The function `letter_subsets(honeycomb)` returns these.\n",
|
||||||
"5. To compute `game_score`, just take the sum of the 64 subset entries in the points table.\n",
|
"5. To compute `game_score`, just take the sum of the 64 subset entries in the points table.\n",
|
||||||
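The five steps above can be sketched end-to-end. The names follow the notebook's prose, but this is a hedged reconstruction rather than its exact cells; note that `game_score` here takes a points table, per step 5:

```python
from collections import Counter
from itertools import combinations

def letterset(word):
    "The distinct letters of a word, as a sorted string."
    return ''.join(sorted(set(word)))

def word_score(word):
    "1 point for a 4-letter word; longer words score their length; +7 for a pangram."
    bonus = 7 if len(letterset(word)) == 7 else 0
    return (1 if len(word) == 4 else len(word)) + bonus

def points_table(words):
    "Precompute {letterset: total points of all words with that letterset}."
    table = Counter()
    for w in words:
        table[letterset(w)] += word_score(w)
    return table

def letter_subsets(honeycomb):
    "All subsets of the honeycomb's letters that include the center letter."
    letters, center = honeycomb
    others = [c for c in letters if c != center]
    return [''.join(sorted(subset + (center,)))
            for n in range(len(others) + 1)
            for subset in combinations(others, n)]

def game_score(honeycomb, pts_table):
    "Sum the points-table entries for the (up to 64) valid letter subsets."
    return sum(pts_table[s] for s in letter_subsets(honeycomb))
```

With the notebook's six-word example, `points_table` reproduces the `Counter({'AGLM': 8, 'AEGM': 1, 'AEGLMPX': 15, 'ACEIORT': 31})` shown below, and `letter_subsets(('ABCD', 'C'))` yields the 8 subsets shown below.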
"\n",
|
"\n",
|
||||||
@ -543,12 +446,12 @@
|
|||||||
"Since 64 < 44,585, that's a nice optimization!\n",
|
"Since 64 < 44,585, that's a nice optimization!\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Here's the code. Notice we've changed the interface to `game_score`; it now takes a points table, not a word list."
|
"Here's the code. Notice we've changed the interface to `game_score`; it now takes a points table, not a word list. So beware if you are jumping around in this notebook and re-executing previous cells."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 19,
|
"execution_count": 16,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@ -589,7 +492,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 20,
|
"execution_count": 17,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -598,13 +501,14 @@
|
|||||||
"['C', 'AC', 'BC', 'CD', 'ABC', 'ACD', 'BCD', 'ABCD']"
|
"['C', 'AC', 'BC', 'CD', 'ABC', 'ACD', 'BCD', 'ABCD']"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 20,
|
"execution_count": 17,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"letter_subsets(('ABCD', 'C')) # A 4-letter honeycomb gives 2**3 = 8 subsets; 7-letter gives 64"
|
"# A 4-letter honeycomb makes 2**3 = 8 subsets; 7-letter honeycombs make 64\n",
|
||||||
|
"letter_subsets(('ABCD', 'C')) "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -616,29 +520,41 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 21,
|
"execution_count": 18,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"data": {
|
||||||
"output_type": "stream",
|
"text/plain": [
|
||||||
"text": [
|
"['AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']"
|
||||||
"['AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']\n"
|
]
|
||||||
]
|
},
|
||||||
},
|
"execution_count": 18,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"words # Remind me again what the words are?"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 19,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"Counter({'AGLM': 8, 'AEGM': 1, 'AEGLMPX': 15, 'ACEIORT': 31})"
|
"Counter({'AGLM': 8, 'AEGM': 1, 'AEGLMPX': 15, 'ACEIORT': 31})"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 21,
|
"execution_count": 19,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"print(words)\n",
|
|
||||||
"points_table(words)"
|
"points_table(words)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@ -653,7 +569,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 22,
|
"execution_count": 20,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@ -677,15 +593,15 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 23,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"CPU times: user 2.05 s, sys: 5.2 ms, total: 2.05 s\n",
|
"CPU times: user 1.99 s, sys: 4.75 ms, total: 2 s\n",
|
||||||
"Wall time: 2.06 s\n"
|
"Wall time: 2 s\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -694,7 +610,7 @@
|
|||||||
"[3898, ('AEGINRT', 'R')]"
|
"[3898, ('AEGINRT', 'R')]"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 23,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -708,15 +624,236 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"**Wow! 3898 is a high score!** And it took only 2 seconds to find it!\n",
|
"**Wow! 3898 is a high score!** And it took only 2 seconds to find it!\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Curiosity\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Fancy Report\n",
|
"I'm curious about a bunch of things.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"I'd like to see the actual words in addition to the total score, and I'm curious about how the words are divided up by letterset. Here's a function to provide such a report. I remembered that there is a `fill` function in Python (it is in the `textwrap` module) but this all turned out to be more complicated than I expected."
|
"What's the highest-scoring individual word?"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 22,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"'ANTITOTALITARIAN'"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 22,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"max(enable1, key=word_score)"
|
||||||
|
]
|
||||||
|
},
|
||||||
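A quick sanity check on that result, using the scoring rule (length points plus a 7-point pangram bonus; this `word_score` is an illustrative sketch, not necessarily the notebook's cell):

```python
def word_score(word):
    # 4-letter words score 1; longer words score their length; pangrams get +7.
    bonus = 7 if len(set(word)) == 7 else 0
    return (1 if len(word) == 4 else len(word)) + bonus

word_score('ANTITOTALITARIAN')  # 16 letters, 7 distinct (AILNORT) → 16 + 7 = 23
```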
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"What are some of the pangrams?"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 23,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"['AARDWOLF',\n",
|
||||||
|
" 'BABBLEMENT',\n",
|
||||||
|
" 'CABEZON',\n",
|
||||||
|
" 'COLLOGUING',\n",
|
||||||
|
" 'DEMERGERING',\n",
|
||||||
|
" 'ETYMOLOGY',\n",
|
||||||
|
" 'GARROTTING',\n",
|
||||||
|
" 'IDENTIFY',\n",
|
||||||
|
" 'LARVICIDAL',\n",
|
||||||
|
" 'MORTGAGEE',\n",
|
||||||
|
" 'OVERHELD',\n",
|
||||||
|
" 'PRAWNED',\n",
|
||||||
|
" 'REINITIATED',\n",
|
||||||
|
" 'TOWHEAD',\n",
|
||||||
|
" 'UTOPIAN']"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 23,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"pangrams[::1000] # Every thousandth one"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"What's the breakdown of reasons why words are invalid?\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 24,
|
"execution_count": 24,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"[('S', 103913), ('valid', 44585), ('>7', 23400), ('<4', 922)]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 24,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"Counter('S' if 'S' in w else '<4' if len(w) < 4 else '>7' if len(set(w)) > 7 else 'valid'\n",
|
||||||
|
" for w in open('enable1.txt').read().upper().split()).most_common()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"There are more than twice as many words with an 'S' as there are valid words.\n",
|
||||||
|
"\n",
|
||||||
|
"About the `points_table`: How many different lettersets are there? Which ones score the most? The least?"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 25,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"21661"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 25,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"pts = points_table(enable1)\n",
|
||||||
|
"len(pts)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"That means there are about two valid words (44,585 / 21,661 ≈ 2.06) for each letterset."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 26,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"[('AEGINRT', 832),\n",
|
||||||
|
" ('ADEGINR', 486),\n",
|
||||||
|
" ('ACILNOT', 470),\n",
|
||||||
|
" ('ACEINRT', 465),\n",
|
||||||
|
" ('CEINORT', 398),\n",
|
||||||
|
" ('AEGILNT', 392),\n",
|
||||||
|
" ('AGINORT', 380),\n",
|
||||||
|
" ('ADEINRT', 318),\n",
|
||||||
|
" ('CENORTU', 318),\n",
|
||||||
|
" ('ACDEIRT', 307),\n",
|
||||||
|
" ('AEGILNR', 304),\n",
|
||||||
|
" ('AEILNRT', 283),\n",
|
||||||
|
" ('AEGINR', 270),\n",
|
||||||
|
" ('ACINORT', 266),\n",
|
||||||
|
" ('ADENRTU', 265),\n",
|
||||||
|
" ('EGILNRT', 259),\n",
|
||||||
|
" ('AILNORT', 252),\n",
|
||||||
|
" ('DEGINR', 251),\n",
|
||||||
|
" ('AEIMNRT', 242),\n",
|
||||||
|
" ('ACELORT', 241)]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 26,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"pts.most_common(20)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 27,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"[('IRY', 1),\n",
|
||||||
|
" ('AGOY', 1),\n",
|
||||||
|
" ('GHOY', 1),\n",
|
||||||
|
" ('GIOY', 1),\n",
|
||||||
|
" ('EKOY', 1),\n",
|
||||||
|
" ('ORUY', 1),\n",
|
||||||
|
" ('EOWY', 1),\n",
|
||||||
|
" ('ANUY', 1),\n",
|
||||||
|
" ('AGUY', 1),\n",
|
||||||
|
" ('ELUY', 1),\n",
|
||||||
|
" ('ANYZ', 1),\n",
|
||||||
|
" ('BEUZ', 1),\n",
|
||||||
|
" ('EINZ', 1),\n",
|
||||||
|
" ('EKRZ', 1),\n",
|
||||||
|
" ('ILZ', 1),\n",
|
||||||
|
" ('CIOZ', 1),\n",
|
||||||
|
" ('KNOZ', 1),\n",
|
||||||
|
" ('NOZ', 1),\n",
|
||||||
|
" ('IORZ', 1),\n",
|
||||||
|
" ('EMYZ', 1)]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 27,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"pts.most_common()[-20:]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Fancy Report\n",
|
||||||
|
"\n",
|
||||||
|
"I'd like to see the actual words that each honeycomb can make, in addition to the total score, and I'm curious about how the words are divided up by letterset. Here's a function to provide such a report. I remembered that there is a `fill` function in Python (it is in the `textwrap` module) but this all turned out to be more complicated than I expected. I guess it is difficult to create a practical extraction and reporting tool. I feel you, Larry Wall."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 28,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"scrolled": false
|
"scrolled": false
|
||||||
},
|
},
|
||||||
@ -733,18 +870,22 @@
|
|||||||
" subsets = letter_subsets(honeycomb)\n",
|
" subsets = letter_subsets(honeycomb)\n",
|
||||||
" bins = group_by(words, letterset)\n",
|
" bins = group_by(words, letterset)\n",
|
||||||
" score = sum(word_score(w) for w in words if letterset(w) in subsets)\n",
|
" score = sum(word_score(w) for w in words if letterset(w) in subsets)\n",
|
||||||
" N = sum(len(bins[s]) for s in subsets)\n",
|
" nwords = sum(len(bins[s]) for s in subsets)\n",
|
||||||
" print(f'For this list of {len(words):,d} words:')\n",
|
" print(f'For this list of {Ns(len(words), \"word\")}:')\n",
|
||||||
" print(f'The {optimal}honeycomb {honeycomb} forms '\n",
|
" print(f'The {optimal}honeycomb {honeycomb} forms '\n",
|
||||||
" f'{N:,d} words for {score:,d} points.')\n",
|
" f'{Ns(nwords, \"word\")} for {Ns(score, \"point\")}.')\n",
|
||||||
" print(f'Here are the words formed, with pangrams first:\\n')\n",
|
" print(f'Here are the words formed by each subset, with pangrams first:\\n')\n",
|
||||||
" for s in sorted(subsets, key=lambda s: (-len(s), s)):\n",
|
" for s in sorted(subsets, key=lambda s: (-len(s), s)):\n",
|
||||||
" if bins[s]:\n",
|
" if bins[s]:\n",
|
||||||
" pts = sum(word_score(w) for w in bins[s])\n",
|
" pts = sum(word_score(w) for w in bins[s])\n",
|
||||||
" print(f'{s} forms {len(bins[s])} words for {pts:,d} points:')\n",
|
" print(f'{s} forms {Ns(len(bins[s]), \"word\")} for {Ns(pts, \"point\")}:')\n",
|
||||||
" words = [f'{w}({word_score(w)})' for w in sorted(bins[s])]\n",
|
" words = [f'{w}({word_score(w)})' for w in sorted(bins[s])]\n",
|
||||||
" print(fill(' '.join(words), width=80,\n",
|
" print(fill(' '.join(words), width=80,\n",
|
||||||
" initial_indent=' ', subsequent_indent=' '))\n",
|
" initial_indent=' ', subsequent_indent=' '))\n",
|
||||||
|
" \n",
|
||||||
|
"def Ns(n, things):\n",
|
||||||
|
" \"\"\"Ns(3, 'bear') => '3 bears'; Ns(1, 'world') => '1 world'\"\"\" \n",
|
||||||
|
" return f\"{n:,d} {things}{'' if n == 1 else 's'}\"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def group_by(items, key):\n",
|
"def group_by(items, key):\n",
|
||||||
" \"Group items into bins of a dict, each bin keyed by key(item).\"\n",
|
" \"Group items into bins of a dict, each bin keyed by key(item).\"\n",
|
||||||
@ -756,7 +897,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 25,
|
"execution_count": 29,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -765,11 +906,11 @@
|
|||||||
"text": [
|
"text": [
|
||||||
"For this list of 6 words:\n",
|
"For this list of 6 words:\n",
|
||||||
"The honeycomb ('AEGLMPX', 'G') forms 4 words for 24 points.\n",
|
"The honeycomb ('AEGLMPX', 'G') forms 4 words for 24 points.\n",
|
||||||
"Here are the words formed, with pangrams first:\n",
|
"Here are the words formed by each subset, with pangrams first:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"AEGLMPX forms 1 words for 15 points:\n",
|
"AEGLMPX forms 1 word for 15 points:\n",
|
||||||
" MEGAPLEX(15)\n",
|
" MEGAPLEX(15)\n",
|
||||||
"AEGM forms 1 words for 1 points:\n",
|
"AEGM forms 1 word for 1 point:\n",
|
||||||
" GAME(1)\n",
|
" GAME(1)\n",
|
||||||
"AGLM forms 2 words for 8 points:\n",
|
"AGLM forms 2 words for 8 points:\n",
|
||||||
" AMALGAM(7) GLAM(1)\n"
|
" AMALGAM(7) GLAM(1)\n"
|
||||||
@ -782,7 +923,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 26,
|
"execution_count": 30,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"scrolled": false
|
"scrolled": false
|
||||||
},
|
},
|
||||||
@ -793,7 +934,7 @@
|
|||||||
"text": [
|
"text": [
|
||||||
"For this list of 44,585 words:\n",
|
"For this list of 44,585 words:\n",
|
||||||
"The optimal honeycomb ('AEGINRT', 'R') forms 537 words for 3,898 points.\n",
|
"The optimal honeycomb ('AEGINRT', 'R') forms 537 words for 3,898 points.\n",
|
||||||
"Here are the words formed, with pangrams first:\n",
|
"Here are the words formed by each subset, with pangrams first:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"AEGINRT forms 50 words for 832 points:\n",
|
"AEGINRT forms 50 words for 832 points:\n",
|
||||||
" AERATING(15) AGGREGATING(18) ARGENTINE(16) ARGENTITE(16) ENTERTAINING(19)\n",
|
" AERATING(15) AGGREGATING(18) ARGENTINE(16) ARGENTITE(16) ENTERTAINING(19)\n",
|
||||||
@ -858,9 +999,9 @@
|
|||||||
" AGRARIAN(8) AIRING(6) ANGARIA(7) ARRAIGN(7) ARRAIGNING(10) ARRANGING(9)\n",
|
" AGRARIAN(8) AIRING(6) ANGARIA(7) ARRAIGN(7) ARRAIGNING(10) ARRANGING(9)\n",
|
||||||
" GARAGING(8) GARNI(5) GARRING(7) GNARRING(8) GRAIN(5) GRAINING(8) INGRAIN(7)\n",
|
" GARAGING(8) GARNI(5) GARRING(7) GNARRING(8) GRAIN(5) GRAINING(8) INGRAIN(7)\n",
|
||||||
" INGRAINING(10) RAGGING(7) RAGING(6) RAINING(7) RANGING(7) RARING(6)\n",
|
" INGRAINING(10) RAGGING(7) RAGING(6) RAINING(7) RANGING(7) RARING(6)\n",
|
||||||
"AGIRT forms 1 words for 5 points:\n",
|
"AGIRT forms 1 word for 5 points:\n",
|
||||||
" TRAGI(5)\n",
|
" TRAGI(5)\n",
|
||||||
"AGNRT forms 1 words for 5 points:\n",
|
"AGNRT forms 1 word for 5 points:\n",
|
||||||
" GRANT(5)\n",
|
" GRANT(5)\n",
|
||||||
"AINRT forms 9 words for 64 points:\n",
|
"AINRT forms 9 words for 64 points:\n",
|
||||||
" ANTIAIR(7) ANTIAR(6) ANTIARIN(8) INTRANT(7) IRRITANT(8) RIANT(5) TITRANT(7)\n",
|
" ANTIAIR(7) ANTIAR(6) ANTIARIN(8) INTRANT(7) IRRITANT(8) RIANT(5) TITRANT(7)\n",
|
||||||
@ -943,7 +1084,7 @@
|
|||||||
" EGER(1) EGGER(5) GREE(1) GREEGREE(8)\n",
|
" EGER(1) EGGER(5) GREE(1) GREEGREE(8)\n",
|
||||||
"EIR forms 2 words for 11 points:\n",
|
"EIR forms 2 words for 11 points:\n",
|
||||||
" EERIE(5) EERIER(6)\n",
|
" EERIE(5) EERIER(6)\n",
|
||||||
"ENR forms 1 words for 1 points:\n",
|
"ENR forms 1 word for 1 point:\n",
|
||||||
" ERNE(1)\n",
|
" ERNE(1)\n",
|
||||||
"ERT forms 7 words for 27 points:\n",
|
"ERT forms 7 words for 27 points:\n",
|
||||||
" RETE(1) TEETER(6) TERETE(6) TERRET(6) TETTER(6) TREE(1) TRET(1)\n",
|
" RETE(1) TEETER(6) TERETE(6) TERRET(6) TETTER(6) TREE(1) TRET(1)\n",
|
||||||
@ -967,7 +1108,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 27,
|
"execution_count": 31,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@ -979,7 +1120,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 28,
|
"execution_count": 32,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"scrolled": false
|
"scrolled": false
|
||||||
},
|
},
|
||||||
@ -990,7 +1131,7 @@
|
|||||||
"text": [
|
"text": [
|
||||||
"For this list of 98,141 words:\n",
|
"For this list of 98,141 words:\n",
|
||||||
"The optimal honeycomb ('AEINRST', 'E') forms 1,179 words for 8,681 points.\n",
|
"The optimal honeycomb ('AEINRST', 'E') forms 1,179 words for 8,681 points.\n",
|
||||||
"Here are the words formed, with pangrams first:\n",
|
"Here are the words formed by each subset, with pangrams first:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"AEINRST forms 86 words for 1,381 points:\n",
|
"AEINRST forms 86 words for 1,381 points:\n",
|
||||||
" ANESTRI(14) ANTISERA(15) ANTISTRESS(17) ANTSIER(14) ARENITES(15)\n",
|
" ANESTRI(14) ANTISERA(15) ANTISTRESS(17) ANTSIER(14) ARENITES(15)\n",
|
||||||
@ -1164,7 +1305,7 @@
|
|||||||
" AERIE(5) AERIER(6) AIRER(5) AIRIER(6)\n",
|
" AERIE(5) AERIER(6) AIRER(5) AIRIER(6)\n",
|
||||||
"AEIS forms 2 words for 13 points:\n",
|
"AEIS forms 2 words for 13 points:\n",
|
||||||
" EASIES(6) SASSIES(7)\n",
|
" EASIES(6) SASSIES(7)\n",
|
||||||
"AEIT forms 1 words for 6 points:\n",
|
"AEIT forms 1 word for 6 points:\n",
|
||||||
" TATTIE(6)\n",
|
" TATTIE(6)\n",
|
||||||
"AENR forms 9 words for 40 points:\n",
|
"AENR forms 9 words for 40 points:\n",
|
||||||
" ANEAR(5) ARENA(5) EARN(1) EARNER(6) NEAR(1) NEARER(6) RANEE(5) REEARN(6)\n",
|
" ANEAR(5) ARENA(5) EARN(1) EARNER(6) NEAR(1) NEARER(6) RANEE(5) REEARN(6)\n",
|
||||||
@ -1234,15 +1375,15 @@
|
|||||||
" ASEA(1) ASSES(5) ASSESS(6) ASSESSES(8) EASE(1) EASES(5) SASSES(6) SEAS(1)\n",
|
" ASEA(1) ASSES(5) ASSESS(6) ASSESSES(8) EASE(1) EASES(5) SASSES(6) SEAS(1)\n",
|
||||||
"AET forms 2 words for 2 points:\n",
|
"AET forms 2 words for 2 points:\n",
|
||||||
" TATE(1) TEAT(1)\n",
|
" TATE(1) TEAT(1)\n",
|
||||||
"EIN forms 1 words for 1 points:\n",
|
"EIN forms 1 word for 1 point:\n",
|
||||||
" NINE(1)\n",
|
" NINE(1)\n",
|
||||||
"EIR forms 2 words for 11 points:\n",
|
"EIR forms 2 words for 11 points:\n",
|
||||||
" EERIE(5) EERIER(6)\n",
|
" EERIE(5) EERIER(6)\n",
|
||||||
"EIS forms 7 words for 35 points:\n",
|
"EIS forms 7 words for 35 points:\n",
|
||||||
" ISSEI(5) ISSEIS(6) SEIS(1) SEISE(5) SEISES(6) SISES(5) SISSIES(7)\n",
|
" ISSEI(5) ISSEIS(6) SEIS(1) SEISE(5) SEISES(6) SISES(5) SISSIES(7)\n",
|
||||||
"EIT forms 1 words for 6 points:\n",
|
"EIT forms 1 word for 6 points:\n",
|
||||||
" TITTIE(6)\n",
|
" TITTIE(6)\n",
|
||||||
"ENR forms 1 words for 1 points:\n",
|
"ENR forms 1 word for 1 point:\n",
|
||||||
" ERNE(1)\n",
|
" ERNE(1)\n",
|
||||||
"ENS forms 6 words for 20 points:\n",
|
"ENS forms 6 words for 20 points:\n",
|
||||||
" NESS(1) NESSES(6) SEEN(1) SENE(1) SENSE(5) SENSES(6)\n",
|
" NESS(1) NESSES(6) SEEN(1) SENE(1) SENSE(5) SENSES(6)\n",
|
||||||
@ -1257,7 +1398,7 @@
|
|||||||
" SESTET(6) SESTETS(7) SETS(1) SETT(1) SETTEE(6) SETTEES(7) SETTS(5) STET(1)\n",
|
" SESTET(6) SESTETS(7) SETS(1) SETT(1) SETTEE(6) SETTEES(7) SETTS(5) STET(1)\n",
|
||||||
" STETS(5) TEES(1) TEST(1) TESTEE(6) TESTEES(7) TESTES(6) TESTS(5) TETS(1)\n",
|
" STETS(5) TEES(1) TEST(1) TESTEE(6) TESTEES(7) TESTES(6) TESTS(5) TETS(1)\n",
|
||||||
" TSETSE(6) TSETSES(7)\n",
|
" TSETSE(6) TSETSES(7)\n",
|
||||||
"EN forms 1 words for 1 points:\n",
|
"EN forms 1 word for 1 point:\n",
|
||||||
" NENE(1)\n",
|
" NENE(1)\n",
|
||||||
"ES forms 3 words for 7 points:\n",
|
"ES forms 3 words for 7 points:\n",
|
||||||
" ESES(1) ESSES(5) SEES(1)\n"
|
" ESES(1) ESSES(5) SEES(1)\n"
|
||||||
@ -1272,9 +1413,11 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Here are the highest-scoring honeycombs, with and without an S:\n",
|
"# Pictures\n",
|
||||||
"\n",
|
"\n",
|
||||||
"<img src=\"http://norvig.com/honeycombs.png\" width=\"400\">"
|
"Here are pictures for the highest-scoring honeycombs, with and without an S:\n",
|
||||||
|
"\n",
|
||||||
|
"<img src=\"http://norvig.com/honeycombs.png\" width=\"350\">"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||