Add files via upload
This commit is contained in:
parent
1e7a6f05fc
commit
bc3d113df5
@ -26,12 +26,13 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"# Approach to a Solution\n",
|
"# Approach to a Solution\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Since the word list was on my web site (it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to submit an answer. I had worked on word puzzles before, like Scrabble and Boggle. My first thought is that this puzzle is rather different because it deals with *unordered sets* of letters, not *ordered permutations* of letters. So I had a strategy that differs from past word puzzles: \n",
|
"Since the word list was on my web site (it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to submit an answer. I had worked on word puzzles before, like Scrabble and Boggle. My first thought is that this puzzle is rather different because it deals with *unordered sets* of letters, not *ordered permutations* of letters. That makes things easier because, roughly speaking, ($26$ choose $7$) is 10,000 times smaller than $26^7$. So I'm optimistic that I can find an optimal solution, not just hillclimb to a pretty good solution (like I did in trying to find a high-scoring Boggle board). I'm developing a sketch of a strategy:\n",
|
||||||
|
" \n",
|
||||||
"\n",
|
"\n",
|
||||||
"- Represent a word (and a honeycomb) as a set of letters, which I'll implement as a sorted string. So both \"GLAM\" and \"AMALGAM\" will be represented by the string `\"AGLM\"`.\n",
|
"- Represent a word as a set of letters, which I'll implement as a sorted string, as returned by `letterset(\"WORD\")`. So both GLAM and AMALGAM will be represented by the string `\"AGLM\"`. (Note: I could have used a `frozenset`, but strings have a more compact printed representation, making them easier to debug, and they take up less memory. I didn't need any `set` operations like union and intersection.) \n",
|
||||||
"- To represent a honeycomb, use the sorted letter set, but also keep track of the center. So the honeycomb in the image above would be represented by `('AEGLMPX', 'G')`.\n",
|
"- To represent a honeycomb, use a letterset, but also keep track of the center. So the honeycomb in the image above would be represented by `('AEGLMPX', 'G')`.\n",
|
||||||
"- Since every honeycomb must contain a pangram, I can find the best honeycomb by considering all possible pangrams (with all possible centers) and taking the one that scores highest.\n",
|
"- Since every honeycomb must contain a pangram, I can find the best honeycomb by considering all possible pangrams and all possible centers and taking the one that scores highest.\n",
|
||||||
"- I'm hoping there aren't too many candidate pangrams, but I can do some pre-computation to make it easier to find the best one."
|
"- I'm hoping there aren't too many candidate pangrams. I can do some pre-computation to make the computation of scores faster."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
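Review note on the hunk above: the counting claim and the letterset representation can be checked with a short sketch. The body of `letterset` is not shown in this diff, so the definition below is an assumption that matches the description (the distinct letters of a word, as a sorted string). The names `unordered`, `ordered`, and `ratio` are illustrative only.

```python
from math import comb

def letterset(word) -> str:
    """Assumed definition: the distinct letters of `word`, as a sorted string."""
    return ''.join(sorted(set(word)))

unordered = comb(26, 7)      # ways to choose 7 letters out of 26, ignoring order
ordered = 26 ** 7            # ordered sequences of 7 letters
ratio = ordered / unordered  # how much smaller the unordered search space is

print(letterset('GLAM'), letterset('AMALGAM'))  # both map to the same letterset
print(unordered, ordered, round(ratio))
```

The ratio comes out a little above 12,000, which is consistent with the "roughly 10,000 times smaller" estimate in the new text.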
@ -40,7 +41,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"# Words, Word Scores, Pangrams, and Lettersets\n",
|
"# Words, Word Scores, Pangrams, and Lettersets\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Here I load some modules and define four basic functions about words:"
|
"I'll start by loading some modules and defining four basic functions about words:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -67,7 +68,7 @@
|
|||||||
"def word_score(word) -> int: \n",
|
"def word_score(word) -> int: \n",
|
||||||
" N = len(word)\n",
|
" N = len(word)\n",
|
||||||
" bonus = (7 if is_pangram(word) else 0)\n",
|
" bonus = (7 if is_pangram(word) else 0)\n",
|
||||||
" return (0 if N < 4 else 1 if N == 4 else N + bonus)\n",
|
" return (1 if N == 4 else N + bonus)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def is_pangram(word) -> bool: \n",
|
"def is_pangram(word) -> bool: \n",
|
||||||
" \"\"\"Does a word use all 7 letters (some maybe more than once)?\"\"\"\n",
|
" \"\"\"Does a word use all 7 letters (some maybe more than once)?\"\"\"\n",
|
||||||
@ -89,7 +90,18 @@
|
|||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 3,
|
"execution_count": 3,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"{'AMALGAM', 'GAME', 'GLAM', 'MAPLE', 'MEGAPLEX', 'PELAGIC'}"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 3,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"words = Words('amalgam amalgamation game games gem glam maple megaplex pelagic I me')\n",
|
"words = Words('amalgam amalgamation game games gem glam maple megaplex pelagic I me')\n",
|
||||||
"words"
|
"words"
|
||||||
@ -99,14 +111,27 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Note that `I`, `me` and `gem` are too short, `games` has an `s` which is not allowed, and `amalgamation` has too many distinct letters. Here are examples of the functions in action:"
|
"Note that `I`, `me` and `gem` are too short, `games` has an `s` which is not allowed, and `amalgamation` has too many distinct letters. \n",
|
||||||
|
"\n",
|
||||||
|
"Here are examples of the functions in action:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 4,
|
"execution_count": 4,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"{'GLAM': 1, 'MEGAPLEX': 15, 'GAME': 1, 'MAPLE': 5, 'PELAGIC': 14, 'AMALGAM': 7}"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 4,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"{w: word_score(w) for w in words}"
|
"{w: word_score(w) for w in words}"
|
||||||
]
|
]
|
||||||
@ -115,7 +140,18 @@
|
|||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 5,
|
"execution_count": 5,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"{'MEGAPLEX', 'PELAGIC'}"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"{w for w in words if is_pangram(w)}"
|
"{w for w in words if is_pangram(w)}"
|
||||||
]
|
]
|
||||||
@ -124,7 +160,23 @@
|
|||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 6,
|
"execution_count": 6,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"{'GLAM': 'AGLM',\n",
|
||||||
|
" 'MEGAPLEX': 'AEGLMPX',\n",
|
||||||
|
" 'GAME': 'AEGM',\n",
|
||||||
|
" 'MAPLE': 'AELMP',\n",
|
||||||
|
" 'PELAGIC': 'ACEGILP',\n",
|
||||||
|
" 'AMALGAM': 'AGLM'}"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 6,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"{w: letterset(w) for w in words}"
|
"{w: letterset(w) for w in words}"
|
||||||
]
|
]
|
||||||
@ -188,7 +240,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Notice that only 1/4 of the words are valid: the others are either shorter than 4 letters in length, or contain an 'S', or have more than 7 distinct letters."
|
"Notice that only about 1/4 of the words are valid: the others are either shorter than 4 letters in length, or contain an 'S', or have more than 7 distinct letters."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -199,16 +251,16 @@
|
|||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"['PUNGENCY',\n",
|
"['TRIFLED',\n",
|
||||||
" 'MUCKRAKER',\n",
|
" 'FLYPAPER',\n",
|
||||||
" 'BECUDGELED',\n",
|
" 'CANTONMENT',\n",
|
||||||
" 'PERICARDIA',\n",
|
" 'COLLOGUING',\n",
|
||||||
" 'PYRONINE',\n",
|
" 'TRUNDLER',\n",
|
||||||
" 'HARBORED',\n",
|
" 'UNPITIED',\n",
|
||||||
" 'FIREDRAKE',\n",
|
" 'KNUCKLING',\n",
|
||||||
" 'POUNDAL',\n",
|
" 'DEVALUATE',\n",
|
||||||
" 'MIAULED',\n",
|
" 'UNAFRAID',\n",
|
||||||
" 'TAGMEMIC']"
|
" 'INJECTANT']"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 9,
|
"execution_count": 9,
|
||||||
@ -245,86 +297,71 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Only 14,741 candidate pangrams. I'm encouraged about my initial approach to a solution. "
|
"So: 14,741 candidate pangrams. Feasible.\n",
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Efficiency\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"The goal is to find the honeycomb that maximizes the total score of the words we can make. I've chosen to go down the path of considering all 14,741 pangrams, and all 7 centers for each pangram, for a total of 103,187 candidate honeycombs. \n",
|
"I'm also curious: what's the highest-scoring individual word?"
|
||||||
"\n",
|
|
||||||
"I could check each honeycomb against each word. But there are 44,585 valid words in `enable1`.\n",
|
|
||||||
"I'll make things more efficient by *caching* some important information: \n",
|
|
||||||
"- For each center letter, I'll collect all the lettersets of words that include that letter (thereby excluding words that don't).\n",
|
|
||||||
"- For each of those lettersets, I'll precompute the points scored (possibly over several words with the same letterset).\n",
|
|
||||||
"\n",
|
|
||||||
"I put this in a dict that I call a *scoring table*. This is efficient because:\n",
|
|
||||||
"- We do the `scoring_table` calculation once, and then use it 103,187 times.\n",
|
|
||||||
"- For each candidate honeycomb we don't have to consider all the possible valid words. As we will see below, we will only need 64 table lookups per honeycomb, not 44,585."
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 11,
|
"execution_count": 11,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"('ANTITOTALITARIAN', True, 23)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 11,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"alphabet = 'ABCDEFGHIJKLMNOPQRTUVWXYZ'\n",
|
"w = max(enable1, key=word_score)\n",
|
||||||
|
"w, is_pangram(w), word_score(w)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Efficiency: Caching a Scoring Table\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def scoring_table(words) -> dict:\n",
|
"The goal is to find the honeycomb that maximizes the `game_score`: the total score of all words that can be made with the honeycomb. I've chosen to go down the path of considering all 14,741 pangrams, and all 7 centers for each pangram, for a total of 103,187 candidate honeycombs. \n",
|
||||||
" \"\"\"Return a dict of {C: {letterset: sum_of_word_scores}} for all center letters C.\"\"\"\n",
|
"I'll make things more efficient by *caching* some important information so I don't need to recompute it 103,187 times.\n",
|
||||||
" table = {C: Counter() for C in alphabet}\n",
|
"- For each word, I'll precompute the `letterset` and the `word_score`.\n",
|
||||||
" for w in words:\n",
|
"- For each letterset, I'll precompute the total `word_score` points (over all the words with that letterset)."
|
||||||
" score = word_score(w)\n",
|
|
||||||
" s = letterset(w)\n",
|
|
||||||
" for C in s:\n",
|
|
||||||
" table[C][s] += score\n",
|
|
||||||
" return table"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 12,
|
"execution_count": 12,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"def scoring_table(words) -> dict:\n",
|
||||||
|
" \"\"\"Return a dict of {letterset: sum_of_word_scores} over words.\"\"\"\n",
|
||||||
|
" table = Counter()\n",
|
||||||
|
" for w in words:\n",
|
||||||
|
" s = letterset(w)\n",
|
||||||
|
" table[s] += word_score(w)\n",
|
||||||
|
" return table"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 13,
|
||||||
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"{'A': Counter({'AGLM': 8,\n",
|
"Counter({'AGLM': 8, 'AEGLMPX': 15, 'AEGM': 1, 'AELMP': 5, 'ACEGILP': 14})"
|
||||||
" 'AEGLMPX': 15,\n",
|
|
||||||
" 'ACEGILP': 14,\n",
|
|
||||||
" 'AEGM': 1,\n",
|
|
||||||
" 'AELMP': 5}),\n",
|
|
||||||
" 'B': Counter(),\n",
|
|
||||||
" 'C': Counter({'ACEGILP': 14}),\n",
|
|
||||||
" 'D': Counter(),\n",
|
|
||||||
" 'E': Counter({'AEGLMPX': 15, 'ACEGILP': 14, 'AEGM': 1, 'AELMP': 5}),\n",
|
|
||||||
" 'F': Counter(),\n",
|
|
||||||
" 'G': Counter({'AGLM': 8, 'AEGLMPX': 15, 'ACEGILP': 14, 'AEGM': 1}),\n",
|
|
||||||
" 'H': Counter(),\n",
|
|
||||||
" 'I': Counter({'ACEGILP': 14}),\n",
|
|
||||||
" 'J': Counter(),\n",
|
|
||||||
" 'K': Counter(),\n",
|
|
||||||
" 'L': Counter({'AGLM': 8, 'AEGLMPX': 15, 'ACEGILP': 14, 'AELMP': 5}),\n",
|
|
||||||
" 'M': Counter({'AGLM': 8, 'AEGLMPX': 15, 'AEGM': 1, 'AELMP': 5}),\n",
|
|
||||||
" 'N': Counter(),\n",
|
|
||||||
" 'O': Counter(),\n",
|
|
||||||
" 'P': Counter({'AEGLMPX': 15, 'ACEGILP': 14, 'AELMP': 5}),\n",
|
|
||||||
" 'Q': Counter(),\n",
|
|
||||||
" 'R': Counter(),\n",
|
|
||||||
" 'T': Counter(),\n",
|
|
||||||
" 'U': Counter(),\n",
|
|
||||||
" 'V': Counter(),\n",
|
|
||||||
" 'W': Counter(),\n",
|
|
||||||
" 'X': Counter({'AEGLMPX': 15}),\n",
|
|
||||||
" 'Y': Counter(),\n",
|
|
||||||
" 'Z': Counter()}"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 12,
|
"execution_count": 13,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
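Review note on the hunk above: the new `scoring_table` drops the per-center rows and keeps a single `Counter` keyed by letterset. The sketch below replays it on the six valid words of the tiny list. `word_score` and `scoring_table` follow the new side of this diff; the bodies of `letterset` and `is_pangram` are not visible here, so they are assumptions based on their docstrings and the outputs shown.

```python
from collections import Counter

def letterset(word) -> str:
    """Assumed: distinct letters of `word` as a sorted string."""
    return ''.join(sorted(set(word)))

def is_pangram(word) -> bool:
    """Assumed: a pangram uses exactly 7 distinct letters."""
    return len(set(word)) == 7

def word_score(word) -> int:
    """As in the diff: 1 point for 4-letter words, else length plus pangram bonus."""
    N = len(word)
    bonus = 7 if is_pangram(word) else 0
    return 1 if N == 4 else N + bonus

def scoring_table(words) -> Counter:
    """As in the diff: {letterset: sum_of_word_scores} over words."""
    table = Counter()
    for w in words:
        table[letterset(w)] += word_score(w)
    return table

# The valid words from the tiny example list, as shown in the cell output.
words = {'AMALGAM', 'GAME', 'GLAM', 'MAPLE', 'MEGAPLEX', 'PELAGIC'}
table = scoring_table(words)
print(table['AGLM'])  # 7 for AMALGAM plus 1 for GLAM
```

Because center letters no longer index the table, the center constraint has to be enforced later, when subsets are generated.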
@ -333,70 +370,80 @@
|
|||||||
"scoring_table(words)"
|
"scoring_table(words)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": 13,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [
|
|
||||||
{
|
|
||||||
"data": {
|
|
||||||
"text/plain": [
|
|
||||||
"{'GLAM': 'AGLM',\n",
|
|
||||||
" 'AMALGAM': 'AGLM',\n",
|
|
||||||
" 'MEGAPLEX': 'AEGLMPX',\n",
|
|
||||||
" 'PELAGIC': 'ACEGILP',\n",
|
|
||||||
" 'GAME': 'AEGM',\n",
|
|
||||||
" 'MAPLE': 'AELMP'}"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"execution_count": 13,
|
|
||||||
"metadata": {},
|
|
||||||
"output_type": "execute_result"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"source": [
|
|
||||||
"{w: letterset(w) for w in words} # I repeat this here just to remind us of the `words`"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Let's explain what `scoring_table(words)` produces. With the center letter being `'A'`, there are five entries:\n",
|
"Note the letterset\n",
|
||||||
"`'AGLM'` scores 8 (7 for `'AMALGAM'` and 1 for `'GLAM'`), `'AEGLMPX'` scores 15 for `'MEGAPLEX'`, and so on. There are separate entries for each center letter; for example the center letter `'X'` has only one word, `'MEGAPLEX'`; it has letterset `'AEGLMPX'` and scores 15 points.\n",
|
"`'AGLM'` scores 8 points as the sum over two words: 7 for `'AMALGAM'` and 1 for `'GLAM'`. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Computing the Game Score\n",
|
"The following says that there are about twice as many words as lettersets: on average about two words have the same letterset.\n",
|
||||||
"\n",
|
"\n"
|
||||||
"Given a honeycomb expressed as a (letterset, center letter) pair, we can efficiently compute the `game_score` (the total score of all words that can be made) using the precomputed scoring table:\n",
|
|
||||||
"- First we get the \"row\" of the table defined by the center letter.\n",
|
|
||||||
"- Next we consider every possible *subset* of the letters in the honeycomb. There are 7 letters in a honeycomb, but we must always include the central letter, so we're really asking for how many subsets there are of 6 letters, and the answer is $2^6 = 64$.\n",
|
|
||||||
"- We compute the subsets of letters with `letter_subsets`, fetch the 64 entries in the row, and add them up.\n"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 14,
|
"execution_count": 14,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"2.058307557361156"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 14,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"def game_score(letters, center, table):\n",
|
"len(enable1) / len(scoring_table(enable1))"
|
||||||
" \"The total score by this honeycomb (i.e., letters/center pair).\"\n",
|
]
|
||||||
" row = table[center]\n",
|
},
|
||||||
" subsets = letter_subsets(letters, center)\n",
|
{
|
||||||
" return sum(row[s] for s in subsets)\n",
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Computing the Game Score\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def letter_subsets(letters, center):\n",
|
"Given a honeycomb we can efficiently compute the `game_score` as follows:\n",
|
||||||
" \"\"\"All subsets of `letters` that contain `center` letter.\"\"\"\n",
|
"- For each of the 103,187 honeycombs, I could look at every possible word to see if it can be made. But there are 44,585 words. Infeasible.\n",
|
||||||
" return [letterset(subset) \n",
|
"- Instead, generate every possible *subset* of the letters in the honeycomb. A subset must include the central letter, and it may or may not include each of the other 6 letters, so there are $2^6 = 64$ subsets. The function `letter_subsets` returns these.\n",
|
||||||
" for n in range(1, 8) \n",
|
"- We already have letterset scores in the scoring table, so just fetch the 64 entries in the scoring table and add them up.\n",
|
||||||
" for subset in combinations(letters, n) \n",
|
"- 64 is less than 44,585, so that's a nice optimization!\n"
|
||||||
" if center in subset]"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 15,
|
"execution_count": 15,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"def game_score(letters, center, table) -> int:\n",
|
||||||
|
" \"The total score for this honeycomb, given a scoring table.\"\n",
|
||||||
|
" subsets = letter_subsets(letters, center)\n",
|
||||||
|
" return sum(table[s] for s in subsets)\n",
|
||||||
|
"\n",
|
||||||
|
"def letter_subsets(letters, center) -> list:\n",
|
||||||
|
" \"\"\"All subsets of `letters` that contain the letter `center`.\"\"\"\n",
|
||||||
|
" return [letterset(subset) \n",
|
||||||
|
" for n in range(1, 8) \n",
|
||||||
|
" for subset in combinations(letters, n)\n",
|
||||||
|
" if center in subset]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Trying out `letter_subsets`:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 16,
|
||||||
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
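Review note on the hunk above: the 64-subset argument and the new flat-table `game_score` can be exercised on their own. In this sketch the table is copied from the `scoring_table(words)` output shown earlier in the diff rather than recomputed, and `letterset` is again an assumed definition, since its body is not part of this commit.

```python
from collections import Counter
from itertools import combinations

def letterset(letters) -> str:
    """Assumed: distinct letters as a sorted string."""
    return ''.join(sorted(set(letters)))

def letter_subsets(letters, center) -> list:
    """All subsets of `letters` that contain the letter `center`."""
    return [letterset(subset)
            for n in range(1, 8)
            for subset in combinations(letters, n)
            if center in subset]

def game_score(letters, center, table) -> int:
    """The total score for this honeycomb, given a scoring table."""
    return sum(table[s] for s in letter_subsets(letters, center))

# Scoring table for the tiny word list, copied from the notebook output.
table = Counter({'AGLM': 8, 'AEGLMPX': 15, 'AEGM': 1, 'AELMP': 5, 'ACEGILP': 14})

print(len(letter_subsets('AEGLMPX', 'G')))  # 2**6 subsets contain the center
print(letter_subsets('ABCD', 'C'))          # 2**3 subsets for a 4-letter example
print(game_score('AEGLMPX', 'G', table))    # AGLM + AEGLMPX + AEGM = 8 + 15 + 1
```

Looking up a missing key in a `Counter` returns 0 without inserting it, so summing over all 64 subsets is safe even though most of them are not in the table.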
@ -404,18 +451,18 @@
|
|||||||
"64"
|
"64"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 15,
|
"execution_count": 16,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"len(letter_subsets('AEGLMPX', 'G'))"
|
"len(letter_subsets('ABCDEFG', 'C')) # It will always be 64, for any honeycomb"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 16,
|
"execution_count": 17,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -424,25 +471,25 @@
|
|||||||
"['C', 'AC', 'BC', 'CD', 'ABC', 'ACD', 'BCD', 'ABCD']"
|
"['C', 'AC', 'BC', 'CD', 'ABC', 'ACD', 'BCD', 'ABCD']"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 16,
|
"execution_count": 17,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"letter_subsets('ABCD', 'C') # An example of `letter_subsets` computation."
|
"letter_subsets('ABCD', 'C') # A smaller example gives 2**3 = 8 subsets"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Now we can compute a game score for a honeycomb:"
|
"Trying `game_score`:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 17,
|
"execution_count": 18,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -451,7 +498,7 @@
|
|||||||
"24"
|
"24"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 17,
|
"execution_count": 18,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -462,7 +509,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 18,
|
"execution_count": 19,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@ -471,7 +518,7 @@
|
|||||||
"153"
|
"153"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 18,
|
"execution_count": 19,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -487,34 +534,34 @@
|
|||||||
"# The Solution: The Best Honeycomb\n",
|
"# The Solution: The Best Honeycomb\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Here's the function that will give us the solution. `best_honeycomb` searches through every possible pangram (and center) and finds the combination that gives the honeycomb with the highest game score:"
|
"Finally, here's the function that will give us the solution: `best_honeycomb` searches through every possible pangram and center and finds the combination that gives the highest game score:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 19,
|
"execution_count": 20,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"def best_honeycomb(words) -> tuple: \n",
|
"def best_honeycomb(words) -> tuple: \n",
|
||||||
" \"\"\"Return (score, letters, center) for the honeycomb with highest score on these words.\"\"\"\n",
|
" \"\"\"Return (score, letters, center) for the honeycomb with highest score on these words.\"\"\"\n",
|
||||||
" table = scoring_table(words)\n",
|
" table = scoring_table(words)\n",
|
||||||
|
" pangrams = {s for s in table if len(s) == 7}\n",
|
||||||
" return max([game_score(pangram, center, table), pangram, center]\n",
|
" return max([game_score(pangram, center, table), pangram, center]\n",
|
||||||
" for center in alphabet\n",
|
" for pangram in pangrams\n",
|
||||||
" for pangram in table[center] \n",
|
" for center in pangram)"
|
||||||
" if len(pangram) == 7)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"First the solution for the small `words` list:"
|
"First the solution for the tiny `words` list:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 20,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
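Review note on the hunk above: the new `best_honeycomb` first collects the 7-letter lettersets and then tries each of their letters as center. The sketch below reproduces that search on the tiny example. `best_honeycomb_from_table` is a hypothetical variant that takes a precomputed table instead of a word list, so the block stays short; the table itself is copied from the notebook output.

```python
from collections import Counter
from itertools import combinations

def letter_subsets(letters, center) -> list:
    """All subsets of `letters` (as sorted strings) that contain `center`."""
    return [''.join(sorted(subset))
            for n in range(1, 8)
            for subset in combinations(letters, n)
            if center in subset]

def game_score(letters, center, table) -> int:
    """The total score for this honeycomb, given a scoring table."""
    return sum(table[s] for s in letter_subsets(letters, center))

def best_honeycomb_from_table(table) -> list:
    """Hypothetical helper: [score, letters, center] with the highest score."""
    pangrams = {s for s in table if len(s) == 7}
    return max([game_score(pangram, center, table), pangram, center]
               for pangram in pangrams
               for center in pangram)

# Scoring table for the tiny word list, copied from the notebook output.
table = Counter({'AGLM': 8, 'AEGLMPX': 15, 'AEGM': 1, 'AELMP': 5, 'ACEGILP': 14})
best = best_honeycomb_from_table(table)
print(best)
```

On this table the centers `A` and `M` tie at 29 points. Because `max` compares the lists element by element, the tie is broken by the later center letter, which is why `M` is reported.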
@ -523,7 +570,7 @@
|
|||||||
"[29, 'AEGLMPX', 'M']"
|
"[29, 'AEGLMPX', 'M']"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 20,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -536,20 +583,20 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Now the solution for the big `enable1` word list:"
|
"Now the solution for the problem that The Riddler posed, the big `enable1` word list:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 21,
|
"execution_count": 22,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"CPU times: user 4.35 s, sys: 6.58 ms, total: 4.36 s\n",
|
"CPU times: user 4.15 s, sys: 4.46 ms, total: 4.16 s\n",
|
||||||
"Wall time: 4.36 s\n"
|
"Wall time: 4.16 s\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -558,7 +605,7 @@
|
|||||||
"[3898, 'AEGINRT', 'R']"
|
"[3898, 'AEGINRT', 'R']"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 21,
|
"execution_count": 22,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@ -573,12 +620,12 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"**Wow. 3898** is a high score! And it took less than 5 seconds to find it.\n",
|
"**Wow. 3898** is a high score! And it took less than 5 seconds to find it.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Still, this is a bit unsatisfying. I'd like to see the actual words, not just the score. If I had designed my program to be modular rather than to be efficient, I'd already have that. But as is, I need to somewhat repeat myself to create this report:"
|
"However, I'd like to see the actual words in addition to the score. If I had designed my program to be modular rather than to be efficient, that would be trivial. But as is, I need to define a new function, `scoring_words`, before I can create such a report:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 22,
|
"execution_count": 23,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"scrolled": false
|
"scrolled": false
|
||||||
},
|
},
|
||||||
@ -588,14 +635,17 @@
|
|||||||
" \"\"\"Print stats and word scores for the best honeycomb on these words.\"\"\"\n",
|
" \"\"\"Print stats and word scores for the best honeycomb on these words.\"\"\"\n",
|
||||||
" (score, letters, center) = best_honeycomb(words)\n",
|
" (score, letters, center) = best_honeycomb(words)\n",
|
||||||
" sw = scoring_words(letters, center, words)\n",
|
" sw = scoring_words(letters, center, words)\n",
|
||||||
" print(f'For the word list of {len(words)} words, the highest-scoring honeycomb is:')\n",
|
" top = max(sw, key=word_score)\n",
|
||||||
" print(f' {letters} with center {center}')\n",
|
" np = sum(map(is_pangram, sw))\n",
|
||||||
" print(f'It makes {len(sw)} words (for {score} points)',\n",
|
" print(f'''\n",
|
||||||
" f'with {sum(map(is_pangram, sw))} pangrams (*).\\n')\n",
|
" The highest-scoring honeycomb for this list of {len(words)} words is:\n",
|
||||||
|
" {letters} (center {center})\n",
|
||||||
|
" It scores {score} points on {len(sw)} words with {np} pangrams*\n",
|
||||||
|
" The top scoring word is {top} for {word_score(top)} points.\\n''')\n",
|
||||||
" for w in sorted(sw):\n",
|
" for w in sorted(sw):\n",
|
||||||
" print(f'{w} ({word_score(w)}) {\"*\" if is_pangram(w) else \"\"}')\n",
|
" print(f'{w} ({word_score(w)}) {\"*\" if is_pangram(w) else \"\"}')\n",
|
||||||
" \n",
|
" \n",
|
||||||
"def scoring_words(letters, center, words):\n",
|
"def scoring_words(letters, center, words) -> set:\n",
|
||||||
" \"\"\"What words can this honeycomb make?\"\"\"\n",
|
" \"\"\"What words can this honeycomb make?\"\"\"\n",
|
||||||
" subsets = letter_subsets(letters, center)\n",
|
" subsets = letter_subsets(letters, center)\n",
|
||||||
" return {w for w in words if letterset(w) in subsets}"
|
" return {w for w in words if letterset(w) in subsets}"
|
||||||
@ -603,16 +653,18 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 23,
|
"execution_count": 24,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"For the word list of 6 words, the highest-scoring honeycomb is:\n",
|
"\n",
|
||||||
" AEGLMPX with center M\n",
|
" The highest-scoring honeycomb for this list of 6 words is:\n",
|
||||||
"It makes 5 words (for 29 points) with 1 pangrams (*).\n",
|
" AEGLMPX (center M)\n",
|
||||||
|
" It scores 29 points on 5 words with 1 pangrams*\n",
|
||||||
|
" The top scoring word is MEGAPLEX for 15 points.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"AMALGAM (7) \n",
|
"AMALGAM (7) \n",
|
||||||
"GAME (1) \n",
|
"GAME (1) \n",
|
||||||
@ -628,7 +680,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 24,
|
"execution_count": 25,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"scrolled": false
|
"scrolled": false
|
||||||
},
|
},
|
||||||
@ -637,9 +689,11 @@
|
|||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"For the word list of 44585 words, the highest-scoring honeycomb is:\n",
|
"\n",
|
||||||
" AEGINRT with center R\n",
|
" The highest-scoring honeycomb for this list of 44585 words is:\n",
|
||||||
"It makes 537 words (for 3898 points) with 50 pangrams (*).\n",
|
" AEGINRT (center R)\n",
|
||||||
|
" It scores 3898 points on 537 words with 50 pangrams*\n",
|
||||||
|
" The top scoring word is REINTEGRATING for 20 points.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"AERATE (6) \n",
|
"AERATE (6) \n",
|
||||||
"AERATING (15) *\n",
|
"AERATING (15) *\n",
|
||||||
|