\n",
"\n",
"# Spelling Bee Puzzle\n",
"\n",
"The [3 Jan. 2020 edition of the 538 Riddler](https://fivethirtyeight.com/features/can-you-solve-the-vexing-vexillology/) concerns the popular NYTimes [Spelling Bee](https://www.nytimes.com/puzzles/spelling-bee) puzzle:\n",
"\n",
"> In this game, seven letters are arranged in a **honeycomb** lattice, with one letter in the center. Here’s the lattice from Dec. 24, 2019:\n",
"> \n",
"> \n",
"> \n",
"> The goal is to identify as many words as possible that meet the following criteria:\n",
"> 1. The word must be at least four letters long.\n",
"> 2. The word must include the central letter.\n",
"> 3. The word cannot include any letter beyond the seven given letters.\n",
">\n",
">Note that letters can be repeated. For example, the words GAME and AMALGAM are both acceptable words. Four-letter words are worth 1 point each, while five-letter words are worth 5 points, six-letter words are worth 6 points, seven-letter words are worth 7 points, etc. Words that use all of the seven letters in the honeycomb are known as **pangrams** and earn 7 bonus points (in addition to the points for the length of the word). So in the above example, MEGAPLEX is worth 15 points.\n",
">\n",
"> ***Which seven-letter honeycomb results in the highest possible game score?*** To be a valid choice of seven letters, no letter can be repeated, it must not contain the letter S (that would be too easy) and there must be at least one pangram.\n",
">\n",
"> For consistency, please use [this word list](https://norvig.com/ngrams/enable1.txt) to check your game score.\n",
"\n",
"\n",
"\n",
"Since the referenced [word list](https://norvig.com/ngrams/enable1.txt) came from *my* web site, I felt somewhat compelled to solve this one. (Note I didn't make up the word list; it is a standard Scrabble word list that I happen to host a copy of.) I'll show you how I address the problem, step by step:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 1: Words, Word Scores, and Pangrams\n",
"\n",
"Let's start by defining some basics:\n",
"\n",
"- A **valid word** is a string of at least 4 letters, with no 'S', and not more than 7 distinct letters.\n",
"- A **word list** is, well, a list of words.\n",
"- A **pangram** is a word with exactly 7 distinct letters; it scores a **pangram bonus** of 7 points.\n",
"- The **word score** is 1 for a four letter word, or the length of the word for longer words, plus any pangram bonus.\n"
]
},
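{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a compact restatement of the scoring rule above, the word score can be sketched as a piecewise formula. The notation is my own shorthand, not the puzzle's: $|w|$ is the length of word $w$, $d(w)$ is its number of distinct letters, and $[d(w) = 7]$ is 1 when $w$ is a pangram and 0 otherwise.\n",
"\n",
"$$\\text{score}(w) = \\begin{cases} 1 & \\text{if } |w| = 4 \\\\ |w| + 7 \\cdot [d(w) = 7] & \\text{if } |w| > 4 \\end{cases}$$\n",
"\n",
"For MEGAPLEX, $|w| = 8$ and $d(w) = 7$, so the score is $8 + 7 = 15$, matching the puzzle statement. A four-letter word has at most 4 distinct letters, so it can never earn the bonus."
]
},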
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Set, Tuple, Dict\n",
"from collections import Counter, defaultdict, namedtuple\n",
"from itertools import combinations\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"Word = str # Type for a word\n",
"\n",
"def valid(word) -> bool:\n",
" \"\"\"Does word have at least 4 letters, no 'S', and no more than 7 distinct letters?\"\"\"\n",
" return len(word) >= 4 and 'S' not in word and len(set(word)) <= 7\n",
"\n",
"def valid_words(text, valid=valid) -> List[Word]: \n",
" \"\"\"All the valid words in text.\"\"\"\n",
" return [w for w in text.upper().split() if valid(w)]\n",
"\n",
"def pangram_bonus(word) -> int: \n",
" \"\"\"Does a word get a bonus for having 7 distinct letters?\"\"\"\n",
" return 7 if len(set(word)) == 7 else 0\n",
"\n",
"def word_score(word) -> int: \n",
" \"\"\"The points for this word, including bonus for pangram.\"\"\"\n",
" return 1 if len(word) == 4 else len(word) + pangram_bonus(word)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I'll make a mini word list to experiment with: "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['GAME', 'AMALGAM', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mini = valid_words('game amalgam amalgamation glam gem gems em megaplex cacciatore erotica')\n",
"mini"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that `gem` and `em` are too short, `gems` has an `s` which is not allowed, and `amalgamation` has too many distinct letters (8). We're left with six valid words out of the ten candidate words. Here are examples of the other two functions in action:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'CACCIATORE', 'EROTICA', 'MEGAPLEX'}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{w for w in mini if pangram_bonus(w)}"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'GAME': 1,\n",
" 'AMALGAM': 7,\n",
" 'GLAM': 1,\n",
" 'MEGAPLEX': 15,\n",
" 'CACCIATORE': 17,\n",
" 'EROTICA': 14}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{w: word_score(w) for w in mini}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 2: Honeycombs and Game Scores\n",
"\n",
"In a honeycomb the order of the letters doesn't matter; all that matters is:\n",
" 1. The seven distinct letters in the honeycomb.\n",
" 2. The one distinguished center letter.\n",
" \n",
"Thus, we can represent a honeycomb as follows (I wanted to put in my own less verbose `__repr__` method):\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Honeycomb('AEGLMPX', 'G')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"class Honeycomb(namedtuple('_', 'letters, center')):\n",
" def __repr__(self): return f'Honeycomb({self.letters!r}, {self.center!r})'\n",
"\n",
"hc = Honeycomb('AEGLMPX', 'G')\n",
"hc"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The **game score** for a honeycomb is the sum of the word scores for all the words that the honeycomb can make. How do we know if a honeycomb can make a word? It can if (1) the word contains the honeycomb's center and (2) every letter in the word is in the honeycomb. "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def game_score(honeycomb, wordlist) -> int:\n",
" \"\"\"The total score for this honeycomb.\"\"\"\n",
" return sum(word_score(w) \n",
" for w in wordlist if can_make(honeycomb, w))\n",
"\n",
"def can_make(honeycomb, word) -> bool:\n",
" \"\"\"Can the honeycomb make this word?\"\"\"\n",
" letters, center = honeycomb\n",
" return center in word and all(L in letters for L in word)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"24"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"game_score(hc, mini)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'GAME': 1, 'AMALGAM': 7, 'GLAM': 1, 'MEGAPLEX': 15}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{w: word_score(w) for w in mini if can_make(hc, w)}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 3: Best Honeycomb\n",
"\n",
"\n",
"How many possible honeycombs are there? We can put any letter in the center, then any 6 letters around the outside (order doesn't matter); since the letter 'S' is not allowed, this gives a total of 25 × (24 choose 6) = 3,364,900 possible honeycombs. We could conceivably ask for the game score of every one of them and pick the best; that would probably take hours of computation (not seconds, and not days).\n",
"\n",
"However, a key constraint of the game is that **there must be at least one pangram** in the set of words that a valid honeycomb can make. That means that a valid honeycomb must ***be*** the set of seven letters in one of the pangram words in the word list, with any of the seven letters as the center. My approach to find the best (highest scoring) honeycomb is:\n",
"\n",
" * Go through all the words and find all the valid honeycombs: the 7-letter pangram letter sets, with any of the 7 letters as center.\n",
" * Compute the game score for each valid honeycomb and return a honeycomb with maximal game score."
]
},
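{
"cell_type": "markdown",
"metadata": {},
"source": [
"Spelling out the count quoted above (a worked check of the arithmetic, nothing more): with 'S' excluded there are 25 usable letters, so there are 25 choices for the center and then $\\binom{24}{6}$ ways to choose the six outer letters from the remaining 24.\n",
"\n",
"$$25 \\times \\binom{24}{6} = 25 \\times \\frac{24!}{6!\\,18!} = 25 \\times 134{,}596 = 3{,}364{,}900$$\n",
"\n",
"By contrast, restricting attention to pangram lettersets means the number of candidates depends only on how many distinct 7-letter lettersets appear in the word list, times 7 choices of center."
]
},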
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def best_honeycomb(words) -> Honeycomb: \n",
" \"\"\"Return a honeycomb with highest game score on these words.\"\"\"\n",
" return max(valid_honeycombs(words), \n",
" key=lambda h: game_score(h, words))\n",
"\n",
"def valid_honeycombs(words) -> List[Honeycomb]:\n",
" \"\"\"Valid Honeycombs are the pangram lettersets, with any center.\"\"\"\n",
" pangram_lettersets = {letterset(w) for w in words if pangram_bonus(w)}\n",
" return [Honeycomb(letters, center) \n",
" for letters in pangram_lettersets \n",
" for center in letters]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I will represent a **set of letters** as a sorted string of distinct letters. Why not a Python `set` (or `frozenset` if we want it to be the key of a dict)? Because a string takes up less space in memory, and its printed representation is easier to read when debugging. Compare:\n",
"- `frozenset({'A', 'E', 'G', 'L', 'M', 'P', 'X'})`\n",
"- `'AEGLMPX'`\n",
"\n",
"I'll use the name `letterset` for the function that converts a word to a set of letters, and `Letterset` for the resulting type:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"Letterset = str # Type for a set of letters, like \"AGLM\"\n",
"\n",
"def letterset(word) -> Letterset:\n",
" \"\"\"The set of letters in a word, represented as a sorted str.\"\"\"\n",
" return ''.join(sorted(set(word)))"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'GAME': 'AEGM',\n",
" 'AMALGAM': 'AGLM',\n",
" 'GLAM': 'AGLM',\n",
" 'MEGAPLEX': 'AEGLMPX',\n",
" 'CACCIATORE': 'ACEIORT',\n",
" 'EROTICA': 'ACEIORT'}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{w: letterset(w) for w in mini}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that 'AMALGAM' and 'GLAM' have the same letterset, as do 'CACCIATORE' and 'EROTICA'."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Honeycomb('ACEIORT', 'A'),\n",
" Honeycomb('ACEIORT', 'C'),\n",
" Honeycomb('ACEIORT', 'E'),\n",
" Honeycomb('ACEIORT', 'I'),\n",
" Honeycomb('ACEIORT', 'O'),\n",
" Honeycomb('ACEIORT', 'R'),\n",
" Honeycomb('ACEIORT', 'T'),\n",
" Honeycomb('AEGLMPX', 'A'),\n",
" Honeycomb('AEGLMPX', 'E'),\n",
" Honeycomb('AEGLMPX', 'G'),\n",
" Honeycomb('AEGLMPX', 'L'),\n",
" Honeycomb('AEGLMPX', 'M'),\n",
" Honeycomb('AEGLMPX', 'P'),\n",
" Honeycomb('AEGLMPX', 'X')]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"valid_honeycombs(mini)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Honeycomb('ACEIORT', 'A')"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"best_honeycomb(mini)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**We're done!** We know how to find the best honeycomb. But so far, we've only done it for the mini word list. \n",
"\n",
"# Step 4: The enable1 Word List\n",
"\n",
"Here's the real word list, `enable1.txt`, and some counts derived from it:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 172820 enable1.txt\r\n"
]
}
],
"source": [
"! [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt\n",
"! wc -w enable1.txt"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"44585"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"enable1 = valid_words(open('enable1.txt').read())\n",
"len(enable1)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14741"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pangrams = [w for w in enable1 if pangram_bonus(w)]\n",
"len(pangrams)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7986"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len({letterset(w) for w in pangrams}) # pangram lettersets"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"55902"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(valid_honeycombs(enable1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To summarize, there are:\n",
"\n",
"- 172,820 words in the `enable1` word list\n",
"- 44,585 valid Spelling Bee words\n",
"- 14,741 pangram words \n",
"- 7,986 distinct pangram lettersets\n",
"- 55,902 (7 × 7,986) valid pangram-containing honeycombs\n",
"\n",
"How long will it take to run `best_honeycomb(enable1)`? Most of the computation time is in `game_score` (which has to look at all 44,585 valid words), so let's estimate the total time by first checking how long it takes to compute the game score of a single honeycomb:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 9.34 ms, sys: 31 µs, total: 9.37 ms\n",
"Wall time: 9.36 ms\n"
]
},
{
"data": {
"text/plain": [
"153"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%time game_score(hc, enable1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Roughly 10 milliseconds on my computer (this may vary). How many minutes would it be to run `game_score` for all 55,902 valid honeycombs?"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"9.317"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"55902 * 10/1000 / 60"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"About 9 or 10 minutes. I could run `best_honeycomb(enable1)` right now and take a coffee break until it completes, but I think that a puzzle like this deserves a more elegant solution. I'd like to get the run time under a minute (as is suggested in [Project Euler](https://projecteuler.net/)), and I have an idea how to do it.\n",
"\n",
"# Step 5: Faster Algorithm: Points Table\n",
"\n",
"Here's my plan for a more efficient program:\n",
"\n",
"1. Keep the same strategy of trying every pangram letterset, but do some precomputation that will make `game_score` much faster.\n",
"1. The precomputation is: compute the `letterset` and `word_score` for each word, and make a table of `{letterset: total_points}` giving the total number of word score points for all the words that correspond to each letterset. I call this a **points table**.\n",
"3. These calculations are independent of the honeycomb, so they need to be done only once, not 55,902 times. \n",
"4. `game_score2` (the name is changed because the interface has changed) takes a honeycomb and a points table as input. The idea is that every word that the honeycomb can make must have a letterset that is the same as a valid **letter subset** of the honeycomb. A valid letter subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ valid letter subsets. (The function `letter_subsets(honeycomb)` computes these.)\n",
"The result of `game_score2` is the sum of the honeycomb's 64 letter subset entries in the points table.\n",
"\n",
"\n",
"That means that in `game_score2` we no longer need to iterate over 44,585 words and check if each word is a subset of the honeycomb. Instead we iterate over the 64 subsets of the honeycomb and for each one check—in one table lookup—whether it is a word (or more than word) and how many total points those word(s) score. Since 64 < 44,585, that's a nice optimization!\n",
"\n",
"\n",
"Here's the code:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"PointsTable = Dict[Letterset, int]\n",
"\n",
"def best_honeycomb(words) -> Honeycomb: \n",
" \"\"\"Return a honeycomb with highest game score on these words.\"\"\"\n",
" points_table = tabulate_points(words)\n",
" honeycombs = (Honeycomb(letters, center) \n",
" for letters in points_table if len(letters) == 7 \n",
" for center in letters)\n",
" return max(honeycombs, key=lambda h: game_score2(h, points_table))\n",
"\n",
"def tabulate_points(words) -> PointsTable:\n",
" \"\"\"Return a Counter of {letterset: points} from words.\"\"\"\n",
" table = Counter()\n",
" for w in words:\n",
" table[letterset(w)] += word_score(w)\n",
" return table\n",
"\n",
"def letter_subsets(honeycomb) -> List[Letterset]:\n",
" \"\"\"The 64 subsets of the letters in the honeycomb, always including the center letter.\"\"\"\n",
" return [letters \n",
" for n in range(1, 8) \n",
" for letters in map(''.join, combinations(honeycomb.letters, n))\n",
" if honeycomb.center in letters]\n",
"\n",
"def game_score2(honeycomb, points_table) -> int:\n",
" \"\"\"The total score for this honeycomb, using a points table.\"\"\"\n",
" return sum(points_table[letterset] for letterset in letter_subsets(honeycomb))"
]
},
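{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to write down what `game_score2` computes is sketched below. The symbols are my own shorthand rather than names from the code: $P(s)$ is the points-table entry for letterset $s$, taken as 0 when $s$ is absent (which is how a `Counter` behaves on a missing key), and $G(L, c)$ is the game score of the honeycomb with letters $L$ and center $c$.\n",
"\n",
"$$G(L, c) = \\sum_{\\substack{s \\subseteq L \\\\ c \\in s}} P(s)$$\n",
"\n",
"The number of terms is the number of ways to include or exclude each of the 6 non-center letters: $\\sum_{k=0}^{6} \\binom{6}{k} = 2^6 = 64$. So each honeycomb needs 64 table lookups rather than a scan of all 44,585 valid words."
]
},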
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get a feel for how this works. \n",
"\n",
"First `letter_subsets` (a 4-letter honeycomb makes $2^3 = 8$ subsets; 7-letter honeycombs make $2^6 = 64$):"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['G', 'GL', 'GA', 'GM', 'GLA', 'GLM', 'GAM', 'GLAM']"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"letter_subsets(Honeycomb('GLAM', 'G')) "
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['GAME', 'AMALGAM', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mini # Remind me again what the mini word list is?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now `tabulate_points`:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'AEGM': 1, 'AGLM': 8, 'AEGLMPX': 15, 'ACEIORT': 31})"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tabulate_points(mini)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The letterset `'AGLM'` gets 8 points, 7 for AMALGAM and 1 for GLAM. `'ACEIORT'` gets 31 points, 17 for CACCIATORE and 14 for EROTICA. The other lettersets represent one word each. \n",
"\n",
"Let's make sure we haven't broken the `best_honeycomb` function:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"assert best_honeycomb(mini) == Honeycomb('ACEIORT', 'A')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 6: The Solution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, the solution to the puzzle:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.71 s, sys: 1.2 ms, total: 1.71 s\n",
"Wall time: 1.71 s\n"
]
}
],
"source": [
"%time best = best_honeycomb(enable1)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Honeycomb('AEGINRT', 'R'), 3898)"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"best, game_score(best, enable1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Wow! 3898 is a high score!** \n",
"\n",
"And it took less than 2 seconds of computation to find the best honeycomb!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 7: Even Faster Algorithm: Branch and Bound\n",
"\n",
"A run time of 2 seconds is pretty good! But what if the word list were 100 times bigger? What if a honeycomb had 12 letters around the outside, not just 6? We might still be looking for ideas to speed up the computation. I happen to have one.\n",
"\n",
"Consider the word 'EQUIVOKE'. It is a pangram, but what with the 'Q' and 'V' and 'K', it is not a high-scoring honeycomb, regardless of what center is used:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'E': 48, 'Q': 29, 'U': 29, 'I': 32, 'V': 35, 'O': 36, 'K': 34}"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{C: game_score(Honeycomb('EIKOQUV', C), enable1)\n",
" for C in 'EQUIVOKE'}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It would be great if we could eliminate all seven of these honeycombs at once, rather than trying each one in turn. So my idea is to:\n",
"- Keep track of the best honeycomb and best score found so far.\n",
"- For each new pangram letterset, ask \"if we weren't required to use the center letter, would this letterset score higher than the best honeycomb so far?\" \n",
"- If yes, then try it with all seven centers; if not then discard it immediately.\n",
"- This is called a [**branch and bound**](https://en.wikipedia.org/wiki/Branch_and_bound) algorithm: if an **upper bound** of the new letterset's score can't beat the best honeycomb so far, then we prune a whole **branch** of the search tree consisting of the seven honeycombs that have that letterset.\n",
"\n",
"What would the score of a letterset be if we weren't required to use the center letter? It turns out I can make a dummy Honeycomb and specify the empty string for the center, `Honeycomb(letters, '')`, and call `game_score2` on that. This works because of a quirk of Python: we ask if `honeycomb.center in letters`; normally in Python the expression `x in y` means \"is `x` a member of the collection `y`\", but when `y` is a string it means \"is `x` a substring of `y`\", and the empty string is a substring of every string. (If I had represented a letterset as a Python `set`, this wouldn't work.)\n",
"\n",
"Thus, I can rewrite `best_honeycomb` as follows:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"def best_honeycomb2(words) -> Honeycomb: \n",
" \"\"\"Return a honeycomb with highest game score on these words.\"\"\"\n",
" points_table = tabulate_points(words)\n",
" best, best_score = None, 0\n",
" pangrams = (s for s in points_table if len(s) == 7)\n",
" for p in pangrams:\n",
" if game_score2(Honeycomb(p, ''), points_table) > best_score:\n",
" for center in p:\n",
" honeycomb = Honeycomb(p, center)\n",
" score = game_score2(honeycomb, points_table)\n",
" if score > best_score:\n",
" best, best_score = honeycomb, score\n",
" return best"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 393 ms, sys: 1.31 ms, total: 395 ms\n",
"Wall time: 394 ms\n"
]
},
{
"data": {
"text/plain": [
"Honeycomb('AEGINRT', 'R')"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%time best_honeycomb2(enable1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Same honeycomb for the answer, but four times faster—less than half a second.\n",
"\n",
"# Step 8: Curiosity\n",
"\n",
"I'm curious about a bunch of things.\n",
"\n",
"### What's the highest-scoring individual word?"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'ANTITOTALITARIAN'"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(enable1, key=word_score)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What are some of the pangrams?"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['AARDWOLF',\n",
" 'ANCIENTER',\n",
" 'BABBLEMENT',\n",
" 'BIVARIATE',\n",
" 'CABEZON',\n",
" 'CHEERFUL',\n",
" 'COLLOGUING',\n",
" 'CRANKLE',\n",
" 'DEMERGERING',\n",
" 'DWELLING',\n",
" 'ETYMOLOGY',\n",
" 'FLATTING',\n",
" 'GARROTTING',\n",
" 'HANDIER',\n",
" 'IDENTIFY',\n",
" 'INTERVIEWER',\n",
" 'LARVICIDAL',\n",
" 'MANDRAGORA',\n",
" 'MORTGAGEE',\n",
" 'NOTABLE',\n",
" 'OVERHELD',\n",
" 'PERONEAL',\n",
" 'PRAWNED',\n",
" 'QUILTER',\n",
" 'REINITIATED',\n",
" 'TABLEFUL',\n",
" 'TOWHEAD',\n",
" 'UNCHURCHLY',\n",
" 'UTOPIAN',\n",
" 'WINDAGE']"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pangrams[::500] # Every five-hundreth pangram"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What's the breakdown of reasons why words are invalid?"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'< 4': 922, 'valid': 44585, 'has S': 103913, '> 7': 23400})"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Counter('has S' if 'S' in w else \n",
" '< 4' if len(w) < 4 else \n",
" '> 7' if len(set(w)) > 7 else \n",
" 'valid'\n",
" for w in valid_words(open('enable1.txt').read(), lambda w: True))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are more than twice as many words with an 'S' as there are valid words.\n",
"\n",
"### About the points table: How many different letter subsets are there? "
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"21661"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pts = tabulate_points(enable1)\n",
"len(pts)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That means there's about two valid words for each letterset.\n",
"\n",
"### Which letter subsets score the most?"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('AEGINRT', 832),\n",
" ('ADEGINR', 486),\n",
" ('ACILNOT', 470),\n",
" ('ACEINRT', 465),\n",
" ('CEINORT', 398),\n",
" ('AEGILNT', 392),\n",
" ('AGINORT', 380),\n",
" ('ADEINRT', 318),\n",
" ('CENORTU', 318),\n",
" ('ACDEIRT', 307)]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pts.most_common(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The best honeycomb, `'AEGINRT`, is also the highest scoring letter subset on its own (although it only gets 832 of the 3,898 total points from using all seven letters)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### How many honeycombs does `best_honeycomb2` consider?\n",
"\n",
"We know that `best_honeycomb` considers 7,986 × 7 = 55,902 honeycombs. How many does `best_honeycomb2` consider? We can answer that by wrapping `Honeycomb` with a decorator that counts calls:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"8084"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def call_counter(fn):\n",
" \"Return a function that calls fn, and increments a counter on each call.\"\n",
" def wrapped(*args, **kwds):\n",
" wrapped.call_counter += 1\n",
" return fn(*args, **kwds)\n",
" wrapped.call_counter = 0\n",
" return wrapped\n",
" \n",
"Honeycomb = call_counter(Honeycomb)\n",
"\n",
"best = best_honeycomb2(enable1)\n",
"Honeycomb.call_counter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Only 8,084 honeycombs are considered. That means that most pangrams are only considered once; for only 14 pangrams do we consider all seven centers."
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14.0"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(8084 - 7986) / 7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 9: Fancy Report\n",
"\n",
"I'd like to see the actual words that each honeycomb can make, in addition to the total score, and I'm curious about how the words are divided up by letterset. Here's a function to provide such a report. I remembered that there is a `fill` function in Python (it is in the `textwrap` module) but this turned out to be a lot more complicated than I expected. I guess it is difficult to create a practical extraction and reporting tool. I feel you, [Larry Wall](http://www.wall.org/~larry/)."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"from textwrap import fill\n",
"\n",
"def report(honeycomb=None, words=enable1):\n",
" \"\"\"Print stats, words, and word scores for the given honeycomb (or the best\n",
" honeycomb if no honeycomb is given) over the given word list.\"\"\"\n",
" bins = group_by(words, letterset)\n",
" adj = (\"best \" if honeycomb is None else \"\")\n",
" honeycomb = honeycomb or best_honeycomb(words)\n",
" points = game_score(honeycomb, words)\n",
" subsets = letter_subsets(honeycomb)\n",
" nwords = sum(len(bins[s]) for s in subsets)\n",
" print(f'The {adj}{honeycomb} scores {Ns(points, \"point\")} on {Ns(nwords, \"word\")}',\n",
" f'from a {len(words)} word list:\\n')\n",
" for s in sorted(subsets, key=lambda s: (-len(s), s)):\n",
" if bins[s]:\n",
" pts = sum(word_score(w) for w in bins[s])\n",
" wcount = Ns(len(bins[s]), \"pangram\" if len(s) == 7 else \"word\")\n",
" intro = f'{s:>7} {Ns(pts, \"point\"):>10} {wcount:>8} '\n",
" words = [f'{w}:{word_score(w)}' for w in sorted(bins[s])]\n",
" print(fill(' '.join(words), width=110, \n",
" initial_indent=intro, subsequent_indent=' '*8))\n",
" \n",
"def Ns(n, noun):\n",
" \"\"\"A string with `n` followed by the plural or singular of noun:\n",
" Ns(3, 'bear') => '3 bears'; Ns(1, 'world') => '1 world'\"\"\" \n",
" return f\"{n:d} {noun}{' ' if n == 1 else 's'}\"\n",
"\n",
"def group_by(items, key):\n",
" \"Group items into bins of a dict, each bin keyed by key(item).\"\n",
" bins = defaultdict(list)\n",
" for item in items:\n",
" bins[key(item)].append(item)\n",
" return bins"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Honeycomb('AEGLMPX', 'G') scores 24 points on 4 words from a 6 word list:\n",
"\n",
"AEGLMPX 15 points 1 pangram MEGAPLEX:15\n",
" AEGM 1 point 1 word GAME:1\n",
" AGLM 8 points 2 words AMALGAM:7 GLAM:1\n"
]
}
],
"source": [
"report(hc, mini)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The best Honeycomb('AEGINRT', 'R') scores 3898 points on 537 words from a 44585 word list:\n",
"\n",
"AEGINRT 832 points 50 pangrams AERATING:15 AGGREGATING:18 ARGENTINE:16 ARGENTITE:16 ENTERTAINING:19\n",
" ENTRAINING:17 ENTREATING:17 GARNIERITE:17 GARTERING:16 GENERATING:17 GNATTIER:15 GRANITE:14 GRATINE:14\n",
" GRATINEE:15 GRATINEEING:18 GREATENING:17 INGRATE:14 INGRATIATE:17 INTEGRATE:16 INTEGRATING:18\n",
" INTENERATING:19 INTERAGE:15 INTERGANG:16 INTERREGNA:17 INTREATING:17 ITERATING:16 ITINERATING:18\n",
" NATTERING:16 RATTENING:16 REAGGREGATING:20 REATTAINING:18 REGENERATING:19 REGRANTING:17 REGRATING:16\n",
" REINITIATING:19 REINTEGRATE:18 REINTEGRATING:20 REITERATING:18 RETAGGING:16 RETAINING:16\n",
" RETARGETING:18 RETEARING:16 RETRAINING:17 RETREATING:17 TANGERINE:16 TANGIER:14 TARGETING:16\n",
" TATTERING:16 TEARING:14 TREATING:15\n",
" AEGINR 270 points 35 words AGINNER:7 AGREEING:8 ANEARING:8 ANERGIA:7 ANGERING:8 ANGRIER:7 ARGININE:8 EARING:6\n",
" EARNING:7 EARRING:7 ENGRAIN:7 ENGRAINING:10 ENRAGING:8 GAINER:6 GANGRENING:10 GARNERING:9 GEARING:7\n",
" GRAINER:7 GRAINIER:8 GRANNIE:7 GREGARINE:9 NAGGIER:7 NEARING:7 RANGIER:7 REAGIN:6 REARING:7\n",
" REARRANGING:11 REEARNING:9 REENGAGING:10 REGAIN:6 REGAINER:8 REGAINING:9 REGEARING:9 REGINA:6\n",
" REGINAE:7\n",
" AEGIRT 34 points 5 words AIGRET:6 AIGRETTE:8 GAITER:6 IRRIGATE:8 TRIAGE:6\n",
" AEGNRT 94 points 13 words ARGENT:6 GARNET:6 GENERATE:8 GRANTEE:7 GRANTER:7 GREATEN:7 NEGATER:7 REAGENT:7\n",
" REGENERATE:10 REGNANT:7 REGRANT:7 TANAGER:7 TEENAGER:8\n",
" AEINRT 232 points 30 words ARENITE:7 ATTAINER:8 ENTERTAIN:9 ENTERTAINER:11 ENTRAIN:7 ENTRAINER:9 INERRANT:8\n",
" INERTIA:7 INERTIAE:8 INTENERATE:10 INTREAT:7 ITERANT:7 ITINERANT:9 ITINERATE:9 NATTIER:7 NITRATE:7\n",
" RATINE:6 REATTAIN:8 REINITIATE:10 RETAIN:6 RETAINER:8 RETINA:6 RETINAE:7 RETIRANT:8 RETRAIN:7\n",
" TERRAIN:7 TERTIAN:7 TRAINEE:7 TRAINER:7 TRIENNIA:8\n",
" AGINRT 167 points 21 words AIRTING:7 ATTIRING:8 GRANITA:7 GRANTING:8 GRATIN:6 GRATING:7 INGRATIATING:12\n",
" INTRIGANT:9 IRRIGATING:10 IRRITATING:10 NARRATING:9 NITRATING:9 RANTING:7 RATING:6 RATTING:7 TARING:6\n",
" TARRING:7 TARTING:7 TITRATING:9 TRAINING:8 TRIAGING:8\n",
" EGINRT 218 points 26 words ENGIRT:6 ENTERING:8 GETTERING:9 GITTERN:7 GREETING:8 IGNITER:7 INTEGER:7\n",
" INTERNING:9 INTERRING:9 REENTERING:10 REGREETING:10 REGRETTING:10 REIGNITE:8 REIGNITING:10\n",
" REINTERRING:11 RENTING:7 RETINTING:9 RETIRING:8 RETTING:7 RINGENT:7 TEETERING:9 TENTERING:9 TIERING:7\n",
" TITTERING:9 TREEING:7 TRIGGERING:10\n",
" AEGNR 120 points 18 words ANGER:5 ARRANGE:7 ARRANGER:8 ENGAGER:7 ENRAGE:6 GANGER:6 GANGRENE:8 GARNER:6\n",
" GENERA:6 GRANGE:6 GRANGER:7 GREENGAGE:9 NAGGER:6 RANGE:5 RANGER:6 REARRANGE:9 REENGAGE:8 REGNA:5\n",
" AEGRT 123 points 19 words AGGREGATE:9 ERGATE:6 ETAGERE:7 GARGET:6 GARRET:6 GARTER:6 GRATE:5 GRATER:6 GREAT:5\n",
" GREATER:7 REAGGREGATE:11 REGATTA:7 REGRATE:7 RETAG:5 RETARGET:8 TAGGER:6 TARGE:5 TARGET:6 TERGA:5\n",
" AEINR 19 points 3 words INANER:6 NARINE:6 RAINIER:7\n",
" AEIRT 135 points 20 words ARIETTA:7 ARIETTE:7 ARTIER:6 ATTIRE:6 ATTRITE:7 IRATE:5 IRATER:6 IRRITATE:8\n",
" ITERATE:7 RATITE:6 RATTIER:7 REITERATE:9 RETIA:5 RETIARII:8 TARRIER:7 TATTIER:7 TEARIER:7 TERAI:5\n",
" TERRARIA:8 TITRATE:7\n",
" AENRT 132 points 19 words ANTEATER:8 ANTRE:5 ENTERA:6 ENTRANT:7 ENTREAT:7 ERRANT:6 NARRATE:7 NARRATER:8\n",
" NATTER:6 NEATER:6 RANTER:6 RATTEEN:7 RATTEN:6 RATTENER:8 REENTRANT:9 RETREATANT:10 TANNER:6 TERNATE:7\n",
" TERRANE:7\n",
" AGINR 138 points 19 words AGRARIAN:8 AIRING:6 ANGARIA:7 ARRAIGN:7 ARRAIGNING:10 ARRANGING:9 GARAGING:8\n",
" GARNI:5 GARRING:7 GNARRING:8 GRAIN:5 GRAINING:8 INGRAIN:7 INGRAINING:10 RAGGING:7 RAGING:6 RAINING:7\n",
" RANGING:7 RARING:6\n",
" AGIRT 5 points 1 word TRAGI:5\n",
" AGNRT 5 points 1 word GRANT:5\n",
" AINRT 64 points 9 words ANTIAIR:7 ANTIAR:6 ANTIARIN:8 INTRANT:7 IRRITANT:8 RIANT:5 TITRANT:7 TRAIN:5\n",
" TRINITARIAN:11\n",
" EGINR 186 points 24 words ENGINEER:8 ENGINEERING:11 ERRING:6 GINGER:6 GINGERING:9 GINNER:6 GINNIER:7\n",
" GREEING:7 GREENIE:7 GREENIER:8 GREENING:8 GRINNER:7 NIGGER:6 REENGINEER:10 REENGINEERING:13\n",
" REGREENING:10 REIGN:5 REIGNING:8 REINING:7 RENEGING:8 RENIG:5 RENIGGING:9 RERIGGING:9 RINGER:6\n",
" EGIRT 27 points 4 words GRITTIER:8 TERGITE:7 TIGER:5 TRIGGER:7\n",
" EGNRT 12 points 2 words GERENT:6 REGENT:6\n",
" EINRT 190 points 29 words ENTIRE:6 INERT:5 INTER:5 INTERN:6 INTERNE:7 INTERNEE:8 INTERTIE:8 NETTIER:7\n",
" NITER:5 NITERIE:7 NITRE:5 NITRITE:7 NITTIER:7 REINTER:7 RENITENT:8 RENTIER:7 RETINE:6 RETINENE:8\n",
" RETINITE:8 RETINT:6 TEENIER:7 TENTIER:7 TERRINE:7 TINIER:6 TINNER:6 TINNIER:7 TINTER:6 TRIENE:6\n",
" TRINE:5\n",
" GINRT 43 points 6 words GIRTING:7 GRITTING:8 RINGGIT:7 TIRING:6 TRIGGING:8 TRINING:7\n",
" AEGR 84 points 17 words AGER:1 AGGER:5 AGREE:5 ARREARAGE:9 EAGER:5 EAGERER:7 EAGRE:5 EGGAR:5 GAGER:5\n",
" GAGGER:6 GARAGE:6 GEAR:1 RAGE:1 RAGEE:5 RAGGEE:6 REGEAR:6 REGGAE:6\n",
" AEIR 22 points 4 words AERIE:5 AERIER:6 AIRER:5 AIRIER:6\n",
" AENR 40 points 9 words ANEAR:5 ARENA:5 EARN:1 EARNER:6 NEAR:1 NEARER:6 RANEE:5 REEARN:6 RERAN:5\n",
" AERT 127 points 24 words AERATE:6 ARETE:5 EATER:5 ERRATA:6 RATE:1 RATER:5 RATTER:6 REATA:5 RETEAR:6\n",
" RETREAT:7 RETREATER:9 TARE:1 TARRE:5 TARTER:6 TARTRATE:8 TATER:5 TATTER:6 TEAR:1 TEARER:6 TERRA:5\n",
" TERRAE:6 TETRA:5 TREAT:5 TREATER:7\n",
" AGIR 6 points 2 words AGRIA:5 RAGI:1\n",
" AGNR 13 points 5 words GNAR:1 GNARR:5 GRAN:1 GRANA:5 RANG:1\n",
" AGRT 13 points 3 words GRAT:1 RAGTAG:6 TAGRAG:6\n",
" AINR 8 points 4 words AIRN:1 NAIRA:5 RAIN:1 RANI:1\n",
" AIRT 21 points 5 words AIRT:1 ATRIA:5 RIATA:5 TIARA:5 TRAIT:5\n",
" ANRT 50 points 10 words ANTRA:5 ARRANT:6 RANT:1 RATAN:5 RATTAN:6 TANTARA:7 TANTRA:6 TARN:1 TARTAN:6\n",
" TARTANA:7\n",
" EGIR 17 points 3 words GREIGE:6 RERIG:5 RIGGER:6\n",
" EGNR 37 points 6 words GENRE:5 GREEN:5 GREENER:7 REGREEN:7 RENEGE:6 RENEGER:7\n",
" EGRT 45 points 7 words EGRET:5 GETTER:6 GREET:5 GREETER:7 REGREET:7 REGRET:6 REGRETTER:9\n",
" EINR 17 points 4 words INNER:5 REIN:1 RENIN:5 RENNIN:6\n",
" EIRT 87 points 17 words RETIE:5 RETIRE:6 RETIREE:7 RETIRER:7 RITE:1 RITTER:6 TERRIER:7 TERRIT:6 TIER:1\n",
" TIRE:1 TITER:5 TITRE:5 TITTER:6 TITTERER:8 TRIER:5 TRITE:5 TRITER:6\n",
" ENRT 104 points 19 words ENTER:5 ENTERER:7 ENTREE:6 ETERNE:6 NETTER:6 REENTER:7 RENNET:6 RENT:1 RENTE:5\n",
" RENTER:6 RETENE:6 TEENER:6 TENNER:6 TENTER:6 TERN:1 TERNE:5 TERREEN:7 TERRENE:7 TREEN:5\n",
" GINR 44 points 9 words GIRN:1 GIRNING:7 GRIN:1 GRINNING:8 IRING:5 RIGGING:7 RING:1 RINGING:7 RINNING:7\n",
" GIRT 3 points 3 words GIRT:1 GRIT:1 TRIG:1\n",
" AER 25 points 7 words AREA:1 AREAE:5 ARREAR:6 RARE:1 RARER:5 REAR:1 REARER:6\n",
" AGR 2 points 2 words AGAR:1 RAGA:1\n",
" AIR 2 points 2 words ARIA:1 RAIA:1\n",
" ART 24 points 5 words ATTAR:5 RATATAT:7 TART:1 TARTAR:6 TATAR:5\n",
" EGR 15 points 4 words EGER:1 EGGER:5 GREE:1 GREEGREE:8\n",
" EIR 11 points 2 words EERIE:5 EERIER:6\n",
" ENR 1 point 1 word ERNE:1\n",
" ERT 27 points 7 words RETE:1 TEETER:6 TERETE:6 TERRET:6 TETTER:6 TREE:1 TRET:1\n",
" GIR 7 points 2 words GRIG:1 GRIGRI:6\n"
]
}
],
"source": [
"report()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 10: What honeycombs have a high score without a lot of words?\n",
"\n",
"Michael Braverman said he dislikes puzzles with a lot of low-scoring four-letter words. Can we find succint puzzles with lots of points but few words? With two objectives there won't be a single best answer to this question; rather we can ask: what honeycombs are there such that there are no other honeycombs with both more points and fewer words? We say such honeycombs are [**Pareto optimal**](https://en.wikipedia.org/wiki/Pareto_efficiency) and are on the **Pareto frontier**. We can find them as follows:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"def pareto_honeycombs(words) -> list: \n",
" \"\"\"A table of {word_count: (points, honeycomb)} with highest scoring honeycomb.\"\"\"\n",
" points_table = tabulate_points(words)\n",
" wcount_table = Counter(map(letterset, words))\n",
" honeycombs = (Honeycomb(letters, center) \n",
" for letters in points_table if len(letters) == 7 \n",
" for center in letters)\n",
" # Build a table of {word_count: (points, honeycomb)}\n",
" table = defaultdict(lambda: (0, None)) \n",
" for h in honeycombs:\n",
" points = game_score2(h, points_table)\n",
" wcount = game_score2(h, wcount_table)\n",
" table[wcount] = max(table[wcount], (points, h))\n",
" return pareto_frontier(table)\n",
" \n",
"def pareto_frontier(table) -> list:\n",
" \"\"\"The pareto frontier that minimizes word counts while maximizing points.\n",
" Returns a list of (wcount, points, honeycomb, points/wcount) entries\n",
" such that there is no other entry that has fewer words and more points.\"\"\"\n",
" return [(w, p, h, round(p/w, 2))\n",
" for w, (p, h) in sorted(table.items())\n",
" if not any(h2 != h and w2 <= w and p2 >= p\n",
" for w2, (p2, h2) in table.items())]"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"108"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ph = pareto_honeycombs(enable1)\n",
"len(ph)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So there are 108 (out of 55,902) honeycombs on the Pareto frontier. We can see the first ten (sorted by word count), and every tenth one after that:"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(1, 15, Honeycomb('BIMNRUV', 'V'), 15.0),\n",
" (2, 26, Honeycomb('DHNORTX', 'X'), 13.0),\n",
" (3, 31, Honeycomb('CILMOQU', 'Q'), 10.33),\n",
" (4, 32, Honeycomb('BGINOUX', 'X'), 8.0),\n",
" (5, 45, Honeycomb('CEGIPTX', 'G'), 9.0),\n",
" (6, 50, Honeycomb('DELNPUZ', 'Z'), 8.33),\n",
" (7, 62, Honeycomb('BGILNOX', 'X'), 8.86),\n",
" (8, 67, Honeycomb('DGINOXZ', 'X'), 8.38),\n",
" (9, 70, Honeycomb('EFNQRTU', 'Q'), 7.78),\n",
" (10, 84, Honeycomb('CENOQRU', 'Q'), 8.4)]"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ph[:10] # (word count, points, honeycomb, points/wcount) "
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(11, 86, Honeycomb('GINOTUV', 'V'), 7.82),\n",
" (23, 184, Honeycomb('ACELQRU', 'Q'), 8.0),\n",
" (55, 385, Honeycomb('ACINOTU', 'U'), 7.0),\n",
" (73, 496, Honeycomb('CENORTV', 'V'), 6.79),\n",
" (98, 651, Honeycomb('FGILNPU', 'N'), 6.64),\n",
" (135, 929, Honeycomb('CEGILNR', 'G'), 6.88),\n",
" (180, 1207, Honeycomb('BEGINRT', 'G'), 6.71),\n",
" (230, 1566, Honeycomb('BDEGINR', 'G'), 6.81),\n",
" (275, 1856, Honeycomb('EGILNRT', 'R'), 6.75),\n",
" (354, 2414, Honeycomb('ACEINRT', 'N'), 6.82)]"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ph[10::10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see what the frontier looks like by plotting word counts versus points scored:"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3dfZTdVX3v8fdnZpKAgiY8TQMJBJpoBbwCGSEW2ztBisC1BhUFdFlUbOy9cBcs9Sr0ASmULuy1orYUmwoVWyCgQMllwaU8HanWBJjwlBCQMTAwJsIlOQFGhGRmvveP3z6TM5MzOZPDnMf5vNY665zf/u3f7+ydTOab/fDbWxGBmZnZzrTVuwBmZtb4HCzMzKwsBwszMyvLwcLMzMpysDAzs7I66l2Aathnn31i3rx5FV3761//mre+9a2TW6AG4zo2v1avH7iO9dDT0/NSROxb6lxLBot58+bx0EMPVXRtLpeju7t7cgvUYFzH5tfq9QPXsR4k9Y13rurdUJLaJT0s6bZ0fLCkVZKelnSDpOkpfUY67k3n5xXd44KU/pSkD1a7zGZmNlotxizOBdYVHX8duDwiFgB54KyUfhaQj4j5wOUpH5IOBU4HDgNOBP5BUnsNym1mZklVg4WkOcB/A76XjgUcB/woZbkGOCV9XpKOSec/kPIvAZZHxBsR8QzQCxxdzXKbmdlo1R6z+BbwFWDPdLw3sCUiBtNxP3BA+nwA8DxARAxKejnlPwBYWXTP4mtGSFoKLAXo7Owkl8tVVOCBgYGKr20WrmPza/X6gevYaKoWLCR9CHgxInokdReSS2SNMud2ds32hIhlwDKArq6uqHTQqNEGnKrBdWx+rV4/cB0bTTVbFscCH5Z0MrAb8DaylsZMSR2pdTEH2JDy9wNzgX5JHcDbgc1F6QXF15iZWQ1UbcwiIi6IiDkRMY9sgPreiPgUcB9wasp2JnBr+rwiHZPO3xvZkrgrgNPTbKmDgQXAA9Uqt5lZs+rpy3PFfb309OUn/d71eM7iq8BySX8FPAxcldKvAv5FUi9Zi+J0gIhYK+lG4AlgEDg7IoZqX2wzs8bV05fnU99bydbBYaZ3tHHt5xex8KBZk3b/mgSLiMgBufR5PSVmM0XE68DHx7n+UuDS6pXQzKy5rVy/ia2DwwwHbBscZuX6TZMaLLw2lJlZC1h0yN5M72ijXTCto41Fh+w9qfdvyeU+zMymmoUHzeLazy9i5fpNLDpk70ltVYCDhZlZy1h40KxJDxIF7oYyM2sw1ZzVVCm3LMzMGki1ZzVVyi0LM7MGUmpWUyNwsDAzayDVntVUKXdDmZk1kGrPaqqUg4WZWYOp5qymSrkbyszMynKwMDOzshwszMysLAcLMzMry8HCzMzKcrAwM7OyHCzMzKysqgULSbtJekDSo5LWSvrLlP59Sc9IeiS9jkjpkvQdSb2SHpN0VNG9zpT0dHqdOd53mplZdVTzobw3gOMiYkDSNOAnku5I5/5XRPxoTP6TyPbXXgAcA1wJHCNpL+BrQBcQQI+kFRHROMsxmpm1uKq1LCIzkA6npVfs5JIlwA/SdSuBmZJmAx8E7oqIzSlA3AWcWK1ym5nZjqo6ZiGpXdIjwItkv/BXpVOXpq6myyXNSGkHAM8XXd6f0sZLNzOzGqnq2lARMQQcIWkmcIukw4ELgF8B04FlwFeBiwGVusVO0keRtBRYCtDZ2Ukul6uozAMDAxVf2yxcx+bX6vUD17HR1GQhwYjYIikHnBgR30jJb0j6Z+DL6bgfmFt02RxgQ0rvHpOeK/Edy8iCD11dXdHd3T02y4TkcjkqvbZZuI7Nr9XrB65jo6nmbKh9U4sCSbsDxwNPpnEIJAk4BViTLlkB/FGaFbUIeDkiNgJ3AidImiVpFnBCSjMzq0il25Y24nantVLNlsVs4BpJ7WRB6caIuE3SvZL2JeteegT4k5T/duBkoBd4Df
gsQERslnQJ8GDKd3FEbK5iuc2shVW6bWmjbndaK1ULFhHxGHBkifTjxskfwNnjnLsauHpSC2hmU1KpbUsn8ku/0utahZ/gNrMppdJtSxt1u9Na8U55ZjalVLptaaNud1orDhZmNuVUum1pI253WivuhjIzs7IcLMzMrCwHCzMzK8vBwszMynKwMDOzshwszMysLAcLMzMry8HCzMzKcrAwM7OyHCzMzKwsBwsza1hTef+IRuO1ocysIfXmh/jGPVN3/4hG45aFmTWkJzcP7bB/hNWPg4WZNaTf2at9Su8f0WiquQf3bpIekPSopLWS/jKlHyxplaSnJd0gaXpKn5GOe9P5eUX3uiClPyXpg9Uqs5k1jvmz2rn284v44gnvdBdUA6jmmMUbwHERMSBpGvATSXcAXwQuj4jlkr4LnAVcmd7zETFf0unA14HTJB0KnA4cBuwP3C3pHRExVMWym1kDmMr7RzSaqrUsIjOQDqelVwDHAT9K6dcAp6TPS9Ix6fwHJCmlL4+INyLiGaAXOLpa5TYzsx1VdTaUpHagB5gPXAH8AtgSEYMpSz9wQPp8APA8QEQMSnoZ2Dulryy6bfE1xd+1FFgK0NnZSS6Xq6jMAwMDFV/bLFzH5tfq9QPXsdFUNVikrqIjJM0EbgHeVSpbetc458ZLH/tdy4BlAF1dXdHd3V1JkcnlclR6bbNwHZtfq9cPXMdGU5PZUBGxBcgBi4CZkgpBag6wIX3uB+YCpPNvBzYXp5e4xszMaqCas6H2TS0KJO0OHA+sA+4DTk3ZzgRuTZ9XpGPS+XsjIlL66Wm21MHAAuCBapXbzMx2VM1uqNnANWncog24MSJuk/QEsFzSXwEPA1el/FcB/yKpl6xFcTpARKyVdCPwBDAInO2ZUGZmtVW1YBERjwFHlkhfT4nZTBHxOvDxce51KXDpZJfRzMwmxk9wm5lZWQ4WZmZWloOFmZmV5WBhZmZlOViYmVlZDhZmZlaWg4WZmZXlYGFmZmU5WJiZWVkOFmZmVpaDhZmZlVXV/SzMzHZVT1+eles3MWPLEN31LoyNcLAws4bR05fnU99bydbBYToERx6V9x7cDcLdUGbWMFau38TWwWGGAwaHs2NrDA4WZjapevryXHFfLz19+V2+dtEhezO9o412QUdbdmyNwd1QZjZpiruRpne0ce3nF+1SN9LCg2Zx7ecXpTGLPndBNZBqbqs6V9J9ktZJWivp3JR+kaRfSnokvU4uuuYCSb2SnpL0waL0E1Nar6Tzq1VmM3tziruRtg0OV9SNtPCgWZy9eD7zZ7VXoYRWqWq2LAaBL0XEakl7Aj2S7krnLo+IbxRnlnQo2VaqhwH7A3dLekc6fQXwB0A/8KCkFRHxRBXLbmYVKHQjbRscZlpHm7uRWkg1t1XdCGxMn1+VtA44YCeXLAGWR8QbwDNpL+7C9qu9aTtWJC1PeR0szBpMcTfSokP2djdSC1FEVP9LpHnA/cDhwBeBzwCvAA+RtT7ykv4eWBkR/5quuQq4I93ixIj4fEr/NHBMRJwz5juWAksBOjs7Fy5fvryisg4MDLDHHntUdG2zcB2bX6vXD1zHeli8eHFPRHSVOlf1AW5JewA3AedFxCuSrgQuASK9/y3wOUAlLg9Kj6vsEOEiYhmwDKCrqyu6u7srKm8ul6PSa5uF69j8Wr1+4Do2mqoGC0nTyALFtRFxM0BEvFB0/p+A29JhPzC36PI5wIb0ebx0MzOrgWrOhhJwFbAuIr5ZlD67KNtHgDXp8wrgdEkzJB0MLAAeAB4EFkg6WNJ0skHwFdUqt5mN7808Q2HNrZoti2OBTwOPS3okpf0pcIakI8i6kp4FvgAQEWsl3Ug2cD0InB0RQwCSzgHuBNqBqyNibRXLbWYlvNlnKKy5VXM21E8oPQ5x+06uuRS4tET67Tu7zsyqr9QzFA4WU4eX+zCzCSleisPPUEw9Xu7DzCbEz1BMbQ4WZlNQYc+IXf2lv/CgWQ4SU5
SDhdkU44Fqq4THLMymmMlY7M+mHgcLsynGA9VWCXdDmU0xHqi2SjhYmE1BHqi2XeVuKLMW5yU6bDK4ZWHWwjzzySbLhFoWks6V9DZlrpK0WtIJ1S6cmb05nvlkk2Wi3VCfi4hXgBOAfYHPApdVrVRmNik888kmy0S7oQoLAp4M/HNEPJqWIDezBjL2yWzPfLLJMtFg0SPp34GDgQsk7QkMV69YZrarxhuf8MwnmwwT7YY6CzgfeG9EvAZMJ+uKMrMG4fEJq6aJBou7ImJ1RGwBiIhNwOXVK5aZ7SqPT1g17bQbStJuwFuAfSTNYvvYxduA/atcNjMbozAmMWPLEN1jznl8wqqp3JjFF4DzyAJDD9uDxSvAFTu7UNJc4AfAb5GNbyyLiG9L2gu4AZhHtq3qJyIinwbMv002iP4a8JmIWJ3udSbw5+nWfxUR1+xCHc2a0tjB6uIxiQ7BkUfldwgIHp+watlpsIiIbwPflvQ/I+LvdvHeg8CXImJ1GhDvkXQX8Bngnoi4TNL5ZGMhXwVOAhak1zHAlcAxKbh8Degi27e7R9KKiPDjqNaySg1WF49JDAbe1tRqakKzoSLi7yT9LllroKMo/Qc7uWYjsDF9flXSOuAAYAmMtKCvAXJkwWIJ8IOICGClpJmSZqe8d0XEZoAUcE4Erp9oJc2aQU9fnptW948038cOVhfGJLYNDtMuPCZhNTWhYCHpX4DfBh4BhlJykHUzTeT6ecCRwCqgMwUSImKjpP1StgOA54su609p46WP/Y6lwFKAzs5OcrncRIq2g4GBgYqvbRauY+PpzQ9x2QOvMxjZcbugTUBkn2ds6ePVZ/r58lHTeXLzEAfuvpVXn3mU3DN1LXZVNdvfYSWaqY4Tfc6iCzg0/a9/l0jaA7gJOC8iXtnJs3ylTsRO0kcnRCwDlgF0dXVFd3f3rhYVgFwuR6XXNgvXsfGsva+XoXhq5Hg44IyjD+SAmbuPGqzuTuebrX6VcB0by0SDxRqygeqNu3JzSdPIAsW1EXFzSn5B0uzUqpgNvJjS+4G5RZfPATak9O4x6bldKYdZo1t0yN5Maxdbh7L/B03raONjR83xmIQ1jIkGi32AJyQ9ALxRSIyID493QZrddBWwLiK+WXRqBXAm2dpSZwK3FqWfI2k52QD3yymg3An8dZq6C9n6VBdMsNxmTePUrrm89Oob7LvnDD7qQGENZqLB4qIK7n0s8GngcUmPpLQ/JQsSN0o6C3gO+Hg6dzvZtNlesqmznwWIiM2SLgEeTPkuLgx2mzW7wqD2j3r6GRzyMuLWuCY6G+rHu3rjiPgJpccbAD5QIn8AZ49zr6uBq3e1DGaN7LpVz3HhrWsYGo6RQbjCzCcHC2s05Z7g/klEvF/Sq4weVBbZ7/e3VbV0Zi2m8KDdrLdM58Jb1zA4vP2flfAyHda4yj2U9/70vmdtimPWmsZ2N7VJDBUFinbB6Ucf6LEKa1gT3lZV0nuA30uH90fEY9UpkllrKTyN/ca24e3N8wja28TwcNDWJi5ecjifPObAehbTbKcm+lDeucAfA4Xpr9dKWlbBEiBmLan46euxrYObV/ePChQCpk9r48IPHUb+ta1e9M+awkRbFmcBx0TErwEkfR34GeBgYVNeT1+eM5b9bOQZiR/29HP9Hy8aWfzvhw89PxIoOtrFaV1z3d1kTWdXtlUdKjoeYvyZTmYtr3hF2JXrN7FtaPv4Q/GMppXrN40MYgv4RNdcLv3Iu+tUarPKTTRY/DOwStIt6fgUsgfuzKacsSvCXvihw3Z4+rowo6l48b/CU9lmzWiiz1l8U1IOeD/Zf5A+GxEPV7NgZo1q7Pal+de2cv3S95Ucs/CGRNYqJrJT3p8A84HHgX+IiMFaFMysERQ/F1EYjB7bWigEgfECgTckslZQrmVxDbAN+A+yzYneRbZznlnLGzvltU2MLMfh1oJNNeWCxaER8W4ASVcBD1S/SGaNodDdVBi6Lt6I6OzF8x0kbE
ppK3N+W+GDu59sqil0NxX+kbTJy3HY1FWuZfEeSa+kzwJ2T8deG8paWmGsovDgXPGYhVsUNhWVWxuqvVYFMWsUY6fGeslws/LdUGZTQk9fnivu6x1pURSmxm5NYxRmU92EFxI0a1VjWxKfed88CgvCDgfMesv0+hbQrAFUrWUh6WpJL0paU5R2kaRfSnokvU4uOneBpF5JT0n6YFH6iSmtV9L51SqvTV1jH7Jbu/GVkbVs2oD8a1vrWTyzhlDNbqjvAyeWSL88Io5Ir9sBJB0KnA4clq75B0ntktqBK8ie8TgUOCPlNZs0hVlP7Wm200mHz2bGtOx4+jTPfjKDKnZDRcT9kuZNMPsSYHlEvAE8I6kXODqd642I9QCSlqe8T0xycW0KK7Ukxzt/a08/dGdWRNnW11W6eRYsbouIw9PxRcBngFeAh4AvRURe0t8DKyPiX1O+q4A70m1OjIjPp/RPky2Vfk6J71oKLAXo7OxcuHz58orKPDAwwB577FHRtc3CdRytNz/Ek5uH+J292pk/qzkmAPrvsDU0Wh0XL17cExFdpc7VeoD7SuASsv28LwH+FvgcpZc7D0p3k5WMbhGxDFgG0NXVFd3d3RUVMJfLUem1zcJ1zIzd6nR6x1DTTJP132FraKY61jRYRMQLhc+S/gm4LR32A3OLss4BNqTP46Wb7ZLiRQHXbHiZH/X0s61oOY/ifSjMbLSaBgtJsyNiYzr8CFCYKbUCuE7SN4H9gQVk61AJWCDpYOCXZIPgn6xlma059eaHWHtf78iYw9hFAcXoJqrwUh5mO1O1YCHpeqAb2EdSP/A1oFvSEWT/Tp8FvgAQEWsl3Ug2cD0InB0RQ+k+5wB3Au3A1RGxtlplttbQ05fnbx58ncF4auQJ7LGLAhbvhz2tXXzcW52a7VQ1Z0OdUSJ53N31IuJS4NIS6bcDt09i0awFFG9rOvYX/Mr1m9g2nAWEQtdSYXrs1m3DDJMtCtjR5iBhNlF+gtuaTrm1mxYdsjfT2mAoGLU5UaGF4UUBzXadg4U1jUJrYsOW34x64nrsoPTCg2bxlffuxhszDxoVELxjnVnlHCysKRS3JjraREd7G0NDw+MOSs+f1U539/w6lNSsNTlYWFO4eXX/yEymoeHgtKPncsDM3d2VZFYjDhbW8Hr68vzwoedHZjC1t7fxMQ9Km9WU97Owhrdy/SYG05rhAk5d6EBhVmsOFtbwileFnTEta1WYWW25G8qawkePmoPSu1sVZrXnYGENo3jtpsJzEMCoZyo+6laFWV04WFhDGLt2U5tgekfW5bSzZyrMrDYcLKxuilsSd6zZOGrtpkJwCLKgsW1w/GcqzKz6HCysLsZbBbbw3pa2OP3YUXP42FFzvGudWZ05WFjN9fTl+dbdP99hFdg24NgF+3DS4bN3WLvJQcKsvhwsrKbGa1EUxijOO/4dDgxmDcjBwqqueDnx4n0ldtaSMLPG4mBhVXXdque48NY1DEcwvaONCz902KgBa7ckzJqDg4VVTU9fngtvXTOyVMfWwWHyr20d2VfCLQmz5lG15T4kXS3pRUlritL2knSXpKfT+6yULknfkdQr6TFJRxVdc2bK/7SkM6tVXpt8K9dvYmh4+07XbdJIgDh78XwHCrMmUs21ob4PnDgm7XzgnohYANyTjgFOAhak11LgSsiCC9ne3ccARwNfKwQYazw9fXmuuK+Xnr48kK3pNGNaG21kW5hevORwBwizJlXNPbjvlzRvTPISoDt9vgbIAV9N6T+IiABWSpopaXbKe1dEbAaQdBdZALq+WuW2yoy31am7nMxaQ63HLDojYiNARGyUtF9KPwB4vihff0obL30HkpaStUro7Owkl8tVVMCBgYGKr20Wk1nH3vwQT24eYtNvYmQ67NZtw1x/94O8+tvTAThM8Ooz/eSemZSvnJBW/3ts9fqB69hoGmWAWyXSYifpOyZGLAOWAXR1dUV3d3dFBcnlclR6bbOYrDr29OX5m7
tXsm1wmPZ2Ma1j+1anZxz/3rq2JFr977HV6weuY6OpdbB4QdLs1KqYDbyY0vuBuUX55gAbUnr3mPRcDcppE3Dz6n62Dg4DMDgUnHDofrxn7kx3OZm1oFpvfrQCKMxoOhO4tSj9j9KsqEXAy6m76k7gBEmz0sD2CSnNGsDYJt4+e87wLCezFlW1loWk68laBftI6ieb1XQZcKOks4DngI+n7LcDJwO9wGvAZwEiYrOkS4AHU76LC4PdVntj95s4fP+3M71dbBsKprXLO9iZtbBqzoY6Y5xTHyiRN4Czx7nP1cDVk1g02wXFAeLi29busN/ERR8+3Et1mE0BjTLAbQ2oeDpsm8RwxA77TeRf28rZi+fXtZxmVn0OFjaum1f3j7QkiKCtTRDBMNv3m/BmRGZTg4OFjSheHRbghw89P9KS6Oho46I/PIz8a1tH7ZHtriezqcHBwoAdn8D+2FFzRhYAFHDqwjl88pgD61tIM6ubWk+dtQZV6HIau/d1u2DGtDbPdDKb4tyymIJ680Osva93pBuppy/PDUVdTu3t3vvazEZzsJhievry/M2DrzMYT40s+Hfz6n4Gh7Y/Ytf9jn2997WZjeJgMcWsXL+JbcPZ09fbBodZuX5TySexzcyKOVhMIT19eX655Te0t0HE6KmvP3roeT+JbWbjcrCYIkY9YAecfvSBfPSoOSPdTNcvfZ/HJ8xsXA4WLa7w7MSGLb9h62A22ymA/WfuPiooLDxoloOEmY3LwaJF9fTl+e6Pf8G9T75IRNCmbIkOyILFrLdMr2v5zKy5OFi0mJ6+PDet7ufGh54fNcMpikaxBeRf21r7wplZ03KwaCGFcYmR9ZyKtCl7fmJoaJh24TWdzGyXOFi0kFEL/xVpbxOXLDmcd/7Wnqxcv4kZW/o8PmFmu8TBogUUxifuWffC9oX/2sVx79yPffecMWrW08KDZpHL9devsGbWlOoSLCQ9C7wKDAGDEdElaS/gBmAe8CzwiYjISxLwbbKd9F4DPhMRq+tR7kZRvDrsU796lb/4t8cZGjMm8Ymuufz1R95dtzKaWWupZ8ticUS8VHR8PnBPRFwm6fx0/FXgJGBBeh0DXJnep6TrVj3HX9y6hqHhoKNdRDAqUEDW7eQH68xsMjVSN9QSsj27Aa4BcmTBYgnwg7T16kpJMyXNjoiNdSllnRS6mu5+YntX0+DYKEEWKC5ecrjHJMxsUilix184Vf9S6RkgTzbl/x8jYpmkLRExsyhPPiJmSboNuCwifpLS7wG+GhEPjbnnUmApQGdn58Lly5dXVLaBgQH22GOPiq6tht78ED/95Tbu/+XQDi0IyGY5RWRdT0fs187JB09j/qz2nd6z0epYDa1ex1avH7iO9bB48eKeiOgqda5eLYtjI2KDpP2AuyQ9uZO8KpG2w6/NiFgGLAPo6uqK7u7uigqWy+Wo9NrJ1tOX5xv3lJ4KC9kg9sUfPnyXd61rpDpWS6vXsdXrB65jo6lLsIiIDen9RUm3AEcDLxS6lyTNBl5M2fuBuUWXzwE21LTAddDTl+fi/7OW17cN73CuTXD8uzr5wn/9bXc3mVlN1DxYSHor0BYRr6bPJwAXAyuAM4HL0vut6ZIVwDmSlpMNbL/c6uMVPX15Tlv2s1FjEuNNhTUzq4V6tCw6gVuyGbF0ANdFxP+V9CBwo6SzgOeAj6f8t5NNm+0lmzr72doXufqKp8N+/Y51OwxeeyqsmdVTzYNFRKwH3lMifRPwgRLpAZxdg6LVTHFgKGxrOrJ8uGBwTM+Tp8KaWb010tTZKaE4MBS2NV25ftPI8uHDJUayL/FUWDOrs7Z6F2CqKazfNBzbtzVddMjeTO9oo13QMeZv5E9+/xA+ecyB9SmsmVnilkUN9fTl+eFDz49Mg21vbxvpiiq0MApLeNyxZiMnHT7bgcLMGoKDRY309OX51t0/ZzD1Mwk4deHoBf6KPztImFkjcbCogetWPceFaT2nIHtOYnpHmwetza
xpOFhU2XWrnuPP/+3xkYFrAcfO34fzjn+HB63NrGk4WEyi4imxADet7ueGB58fNcOpvU0OFGbWdBwsJknxlNiONoHEtsHRazq1Ca8Ia2ZNycHiTerpy3PT6n4eeGbzyDpO24YCiFGBoiMtHe6BazNrRg4Wb0KpNZwA2ttFm8TQ0DDtbeLjXXO9npOZNTUHiwoVVoUttQHRJ7rm8rGj5oxa0sPMrJk5WFTgstvXsew/1pdcmqMwJbb4uQkzs2bnYDFBhbGJnz79En2bX9vhfBtw/KHeY8LMWpODxQRcdvs6vnv/+pLnBHzymAM9JmFmLc3Boozzlj/Mvz0y/sZ8X/j9Qzj/5HfVsERmZrXnYDGOnr48f37L46z71avj5jnliP0dKMxsSnCwGKM3P8QV3/1PHnw2P26e+fvtweeOPdjPTJjZlNE0wULSicC3gXbgexFx2WR/R09fnr9e9TrDvF7y/KGz9+SSU97tsQkzm3KaIlhIageuAP4A6AcelLQiIp6YzO+5eXU/w+OcO+WI/fnW6UdO5teZmTWNpggWwNFAb9q/G0nLgSXApAaLEo9NMGfmbvyPxQvc5WRmU1qzBIsDgOeLjvuBY4ozSFoKLAXo7Owkl8vt8pccwhDtCoZCCDhpXgef+J12+M16crnSU2eb0cDAQEV/Ps2k1evY6vUD17HRNEuwUIm0UQ2BiFgGLAPo6uqK7u7uXf6S7Ip7eGPmQS29TEcul6OSP59m0up1bPX6gevYaJolWPQDc4uO5wDjP/zwJsyf1U539/xq3NrMrGm11bsAE/QgsEDSwZKmA6cDK+pcJjOzKaMpWhYRMSjpHOBOsqmzV0fE2joXy8xsymiKYAEQEbcDt9e7HGZmU1GzdEOZmVkdOViYmVlZDhZmZlaWIko9t9zcJP0/oK/Cy/cBXprE4jQi17H5tXr9wHWsh4MiYt9SJ1oyWLwZkh6KiK56l6OaXMfm1+r1A9ex0bgbyszMynKwMDOzshwsdrSs3gWoAdex+bV6/cB1bCgeszAzs7LcsjAzs7IcLMzMrCwHiyKSTpT0lKReSefXuzyVknS1pBclrSlK20vSXZKeTu+zUrokfSfV+TFJR9Wv5BMjaa6k+yStk7RW0rkpvdXRQqEAAAXASURBVJXquJukByQ9mur4lyn9YEmrUh1vSKswI2lGOu5N5+fVs/wTJald0sOSbkvHrVa/ZyU9LukRSQ+ltKb8OXWwSIr2+T4JOBQ4Q9Kh9S1Vxb4PnDgm7XzgnohYANyTjiGr74L0WgpcWaMyvhmDwJci4l3AIuDs9HfVSnV8AzguIt4DHAGcKGkR8HXg8lTHPHBWyn8WkI+I+cDlKV8zOBdYV3TcavUDWBwRRxQ9T9GcP6cR4Vc2yP8+4M6i4wuAC+pdrjdRn3nAmqLjp4DZ6fNs4Kn0+R+BM0rla5YXcCvwB61aR+AtwGqyrYRfAjpS+sjPLNny/e9LnztSPtW77GXqNYfsl+VxwG1kO2K2TP1SWZ8F9hmT1pQ/p25ZbFdqn+8D6lSWauiMiI0A6X2/lN7U9U7dEUcCq2ixOqYumkeAF4G7gF8AWyJiMGUprsdIHdP5l4G9a1viXfYt4CvAcDrem9aqH2TbP/+7pB5JS1NaU/6cNs1+FjVQdp/vFtW09Za0B3ATcF5EvCKVqkqWtURaw9cxIoaAIyTNBG4B3lUqW3pvqjpK+hDwYkT0SOouJJfI2pT1K3JsRGyQtB9wl6Qnd5K3oevolsV2Ndvnu05ekDQbIL2/mNKbst6SppEFimsj4uaU3FJ1LIiILUCObHxmpqTCf/KK6zFSx3T+7cDm2pZ0lxwLfFjSs8Bysq6ob9E69QMgIjak9xfJAv7RNOnPqYPFdq2+z/cK4Mz0+Uyyfv5C+h+lmRiLgJcLTeRGpawJcRWwLiK+WXSqleq4b2pRIGl34HiygeD7gFNTtrF1LNT9VODeSB3fjSgiLoiIORExj+zf2r0R8SlapH4Akt4qac
/CZ+AEYA3N+nNa70GTRnoBJwM/J+sb/rN6l+dN1ON6YCOwjex/K2eR9e/eAzyd3vdKeUU2C+wXwONAV73LP4H6vZ+sef4Y8Eh6ndxidfwvwMOpjmuAC1P6IcADQC/wQ2BGSt8tHfem84fUuw67UNdu4LZWq1+qy6PptbbwO6VZf0693IeZmZXlbigzMyvLwcLMzMpysDAzs7IcLMzMrCwHCzMzK8vBwqY0SZdLOq/o+E5J3ys6/ltJX3wT979I0pffbDkr+N4jJJ1c6++11uVgYVPdfwK/CyCpDdgHOKzo/O8CP53IjdLKxY3iCLJnT8wmhYOFTXU/JQULsiCxBnhV0ixJM8jWY3o4PVX7vyWtSfsTnAYgqVvZ3hrXkT1IhaQ/U7Yvyt3AO0t9qaROSbco26/iUUmFgPXF9B1rCi0eSfM0em+SL0u6KH3OSfq6sr0vfi7p99IKBBcDp6V9FE6b9D81m3K8kKBNaZEt8jYo6UCyoPEzspU+30e2suljEbFV0sfI/rf+HrLWx4OS7k+3ORo4PCKekbSQbPmKI8n+fa0Gekp89XeAH0fER1KLZI907WfJliIXsErSj8n2ddiZjog4OnU7fS0ijpd0IdkTwOdU9idjNppbFmbbWxeFYPGzouP/THneD1wfEUMR8QLwY+C96dwDEfFM+vx7wC0R8VpEvML464sdR9rcJt3z5fQdt0TEryNiALg53a+cwkKKPWT7mJhNOgcLs+3jFu8m64ZaSdayKB6vGHf9c+DXY44rXUNnvO8YZPS/1d3GnH8jvQ/h3gKrEgcLsywgfAjYnP6XvxmYSRYwfpby3E82BtAuaV/g98kWtBvrfuAjknZPK47+4TjfeQ/w32Fkk6O3pWtPkfSWtErpR4D/AF4A9pO0dxpH+dAE6vQqsOcE8plNiIOFWTYwvQ9Zi6I47eWIeCkd30K2AuyjwL3AVyLiV2NvFBGrgRvIVsK9ieyXfSnnAoslPU7WfXRYuvb7ZEFoFfC9iHg4IraRDVivItt+dGcb6BTcBxzqAW6bLF511szMynLLwszMynKwMDOzshwszMysLAcLMzMry8HCzMzKcrAwM7OyHCzMzKys/w8zfByNM89YXQAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"W, P, H, PPW = zip(*ph)\n",
"\n",
"def plot(xlabel, X, ylabel, Y): \n",
" plt.plot(X, Y, '.'); plt.xlabel(xlabel); plt.ylabel(ylabel); plt.grid(True)\n",
" \n",
"plot('Word count', W, 'Points', P, )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's somewhat surprising; usually a Pareto frontier looks like a quarter-circle; here it looks like an almost straight line. Maybe we can get a better view by plotting word counts versus the number of points per word:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEGCAYAAABiq/5QAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAfeklEQVR4nO3da5Qc5X3n8e+/ZyRuwjBIIAsLRgwi2KAYrBEwNjiRbOwDXvAF7GPL7C6Lwcrm2Als1ktwvAcTXsWOHbxOSHwBG28ihA2CxdGGY24SEMwI1AJZozUXIRh5EEYgRoC4aC793xdV1eppdc9U90x1T1f9PufMma7qSz1PT8+vnn7qqafM3RERkezINbsAIiLSWAp+EZGMUfCLiGSMgl9EJGMU/CIiGdPe7ALEMWfOHF+wYEFdz33zzTc55JBDprZA00ja6weqY1qojo2Xz+dfcfcjy9e3RPAvWLCADRs21PXcdevWsXTp0qkt0DSS9vqB6pgWqmPjmVl/pfXq6hERyRgFv4hIxij4RUQyRsEvIpIxCn4RkYxR8IuIZEyqgz/fP8iaZ4fI9w82uygiItNGaoM/3z/IRTf0svqZYS66oVfhLyISSm3w927bxdBIAQeGRwr0btvV7CKJiEwLqQ3+nq7ZzGzPkQNmtOfo6Zrd7CKJiEwLqQ3+7s4OVl7WwwUnzGDlZT10d3Y0u0giItNCS8zVU6/uzg7eOH6mQl9EpERqW/wiIlKZgl9EJGMU/CIiGaPgFxHJGAW/iEjGKPhFRDJGwS8ikjEKfhGRjFHwi4hkjIJfRCRjFPwiIhmj4BcRyRgFv4hIxij4RUQyRsEvIpIxCn4RkYxR8IuIZExiwW9mPzGznWbWV+G+r5mZm9mcpLYvIiKVJdnivwk4p3ylmR0DfAzYnuC2RUSkisSC390fBF6tcNd1wJWAJ7VtERGpztyTy18zWwCscfdF4fIngY+6++Vm9jywxN1fqfLcFcAKgLlz53bfcsstdZVhz549zJo1q67ntoK01w9Ux7RQHRtv2bJleXdfst8d7p7YD7AA6AtvHwysBw4Ll58H5sR5ne7ubq/X2rVr635uK0h7/dxVx7RQHRsP2OAVMrWRo3qOB44DNoWt/fnARjN7dwPLICKSee2N2pC7bwaOipYn6uoREZFkJDmccxXwCHCimQ2Y2aVJbUtEROJLrMXv7ssnuH9BUtsWEZHqdOauiEjGKPhFRDJGwS8ikjEKfhGRjFHwi4hkjIJfRCRjFPwiIhmj4BcRyRgFv4hIxij4RUQyRsEvIpIxCn4RkYxR8IuIZIyCX0QkYxT8IiIZk/rg3zo4yvVrt5LvH2x2UUREpoWGXXqxGfL9g3z7sXcY8aeY2Z5j5WU9dHd2NLtYIiJNleoWf++2XQwXoOAwPFKgd9uuZhdJRKTpUh38PV2zmZGDNoMZ7Tl6umY3u0giIk2X6q6e7s4OrjztQPYe3klP12x184iIkPLgB1jY0cbSpQubXQwRkWkj1V09IiKyPwW/iEjGKPhFRDJGwS8ikjEKfhGRjFHwi4hkjIJfRCRjFPwiIhmj4BcRyRgFv4hIxij4RUQyRsEvIpIxVSdpM7PNgFe7393fn0iJREQkUePNznle+Psr4e9/Dn9fBLyVWIlERCRRVYPf3fsBzOxMdz+z5K6rzOxh4NqkCyciIlMvTh//IWZ2VrRgZh8CDkmuSCIikqQ4F2L5EvBTMzuMoM//tXDduMzsJwTdRTvdfVG47m+B84Eh4FngEnffXWfZRUSkDuO2+M0sByx091OA9wOnuvup7r4xxmvfBJxTtu4eYFF4YPhp4Ou1F1lERCZj3OB39wLw1fD26+7+WtwXdvcHgVfL1t3t7iPhYi8wv7biiojIZMXp47/HzL5mZseY2RHRzxRs+0vAXVPwOiIiUgNzrzpUP3iA2XMVVru7d0344mYLgDVRH3/J+m8AS4ALvEoBzG
wFsAJg7ty53bfccstEm6toz549zJo1q67ntoK01w9Ux7RQHRtv2bJleXdfst8d7p7YD7AA6CtbdzHwCHBw3Nfp7u72eq1du7bu57aCtNfPXXVMC9Wx8YANXiFTJxzVY2YzgD8F/ihctQ74obsP17r3MbNzgL8E/tjddRKYiEgTxOnj/yegG/jH8Kc7XDcuM1tF0LI/0cwGzOxS4B+AQwmOGzxhZj+ou+QiIlKXOOP4T/NgOGfkfjPbNNGT3H15hdU3xi6ZiIgkIk6Lf9TMjo8WzKwLGE2uSCIikqQ4Lf7/Aaw1s22AAZ3AJYmWSkREEjNh8Lv7fWZ2AnAiQfA/6e57Ey+ZiIgkIs6onoeAB4GHgIcV+iIirS1OH//FwFPAhcCvzWyDmV2XbLFERCQpcbp6tpnZ2wQzag4By4D3JV0wERFJxoQtfjN7Fvg/wFyC4ZiL3L181k0REWkRcbp6vg9sB5YDfw5cXDq8U0REWsuEwe/u/8vdPwecDeSBawjm0hcRkRYUZ1TPd4GzgFkEUzBcTTDCR0REWlCcE7h6gW+7+0tJF0ZERJIXZ1TPrY0oiIiINEacg7siIpIiCn4RkYwZN/jNLGdmfY0qjIiIJG/c4Hf3ArDJzI5tUHlERCRhcUb1zAO2mNmjwJvRSnf/ZGKlEhGRxMQJ/r9OvBQiItIwcYZzPmBmncAJ7n6vmR0MtCVfNBERSUKcSdq+DNwG/DBc9R6CSdtERKQFxRnO+RXgTOB1AHd/BjgqyUKJiEhy4gT/XncfihbMrB3w5IokIiJJihP8D5jZXwEHmdnHgFuBf022WCIikpQ4wX8V8DKwGfgT4N+A/5lkoUREJDlxRvUUzOxnwHqCLp6n3F1dPSIiLSrOfPz/AfgB8CxgwHFm9ifuflfShRMRkakX5wSu7wLL3H0rQHjZxf8LKPhFRFpQnD7+nVHoh7YBOxMqj4iIJCxOi3+Lmf0b8AuCPv7PAY+Z2QUA7n57guUTEZEpFif4DwReAv44XH4ZOAI4n2BHoOAXEWkhcUb1XNKIgoiISGPoClwiIhmT+uDfOjjK9Wu3ku8fbHZRRESmhTh9/C0r3z/Itx97hxF/ipntOVZe1kN3Z0eziyUi0lRxpmW+3MzeZYEbzWyjmX28EYWbrN5tuxguQMFheKRA77ZdzS6SiEjTxenq+ZK7vw58HDgSuAT4m0RLNUV6umYzIwdtBjPac/R0zW52kUREmi5OV4+Fvz8B/NTdN5mZjfeE6aK7s4MrTzuQvYd30tM1W908IiLEC/68md0NHAd83cwOBQrJFmvqLOxoY+nShc0uhojItBGnq+dSgqmZT3P3t4CZBN094zKzn5jZTjPrK1l3hJndY2bPhL/VBBcRabA4wX+Pu290990A7r4LuC7G824CzilbdxVwn7ufANwXLouISANVDX4zO9DMjgDmmFlH2Fo/wswWAEdP9MLu/iDwatnqTwE/C2//DPh0XaUWEZG6WbVrqpjZ5cAVBCH/AvsO8r4O/Njd/2HCFw92EmvcfVG4vNvdDy+5f9DdK3b3mNkKYAXA3Llzu2+55ZaYVRprz549zJo1q67ntoK01w9Ux7RQHRtv2bJleXdfst8d7j7uD/BnEz1mnOcuAPpKlneX3T8Y53W6u7u9XmvXrq37ua0g7fVzVx3TQnVsPGCDV8jUOJO0/b2ZfSgM8faS9f+7jh3QS2Y2z91fNLN5aF5/EZGGi3PpxX8GjgeeAEbD1Q7UE/y/BC4mOAHsYuDOOl5DREQmIc44/iXASeHXhtjMbBWwlODg8ADwTYLA/4WZXQpsJ7ioi4iINFCc4O8D3g28WMsLu/vyKnd9tJbXERGRqRUn+OcA/8/MHgX2Rivd/ZOJlUpERBITJ/ivSboQIiLSOHFG9TzQiIKIiEhjVA1+M/t3dz/LzN4gGMVTvAtwd39X4qUTEZEpVzX43f2s8PehjSuOiIgkLdalF83sFODD4eKD7v6b5IokIiJJinXpRWAlcF
T4s9LM/izpgomISDLitPgvBc5w9zcBzOxbwCPA3ydZMBERSUac+fiNfVM1EN5uiUsviojI/uK0+H8KrDezO8LlTwM3JlckERFJUpxx/H9nZuuAswha+pe4++NJF0xERJIx3jj+A4H/CiwENgP/6O4jjSqYiIgkY7w+/p8RzMy5GTgX+E5DSiQiIokar6vnJHf/QwAzuxF4tDFFEhGRJI3X4h+ObqiLR0QkPcZr8Z9iZq+Htw04KFzWXD0iIi1svLl62hpZkCTl+wfp3baLnq7ZdHd2NLs4IiJNFWuunla2dXCU79zXy9BIgZntOVZe1qPwF5FMi3Pmbkt78tVRhkYKFByGRwr0btvV7CKJiDRV6oP/vUe0MbM9R5tBW87Ysftt8v2DzS6WiEjTpD74F3a0sfKyHj5/+rFgxqpHt3PRDb0KfxHJrNQHP0B3ZwfvOfwgRkbV5SMikongB+jpml3s8pnRnqOna3aziyQi0hSpH9UT6e7sYOVlPRrWKSKZl5nghyD8FfgiknWZ6eoREZGAgl9EJGMU/CIiGaPgFxHJGAW/iEjGZGpUT75/kNUbBzDggsXzNcJHRDIpM8Gf7x9k+Y8eYWjUAbg1P8CqL2umThHJnsx09dy+caAY+gBDIwWu/dctk56zJ98/yPVrt2ruHxFpGZlo8ef7B7l1w+/2W79p4DWW/7iXa84/mcG3hmo+ozffP8hFN2iufxFpLZkI/t5tuxgpeMX7hkYKXH1nHwX3msO7d9uu/eb6V/CLyHSXia6e0gnaZrYZ7W1WvK8tZxTc65q1UxO/iUgrykSLv3yCNqA4uufkow/j2jVbGB4p1BzemvhNRFpRJoIf9p+grfT2ie8+tO7w1sRvItJqmhL8ZvbfgMsABzYDl7j7O40uR75/sBj4X1m2sNGbFxFpioYHv5m9B/hz4CR3f9vMfgF8AbgpqW2WBnzUOq82Iid6bMfBM+sa6SMiMt01q6unHTjIzIaBg4EdSW1o6+Ao37lv/4CvNCIH4KIbetk7XMCBnKFhmiKSOg0Pfnd/wcy+A2wH3gbudve7yx9nZiuAFQBz585l3bp1dW1v0+/fZu+w4cDQcIFV9z7GG8fP5IDdo7QbjDi0GRywu59V924rhj5Awcc+Zzras2dP3e9Nq1Ad00F1nD6a0dXTAXwKOA7YDdxqZv/R3f+l9HHu/iPgRwBLlizxpUuX1rW9rYP3cc+OoeKoneVnn0Z3ZwdLgQ8sHtsFlO8fZM3zvQwNFyiwr8UfPWc6WrduHfW+N61CdUwH1XH6aEZXz9nAc+7+MoCZ3Q58CPiXcZ9Vp4UdbVWHXFYa6RM9Vn38IpJWzQj+7UCPmR1M0NXzUWBDkhssD/hKB3vHWy8ikibN6ONfb2a3ARuBEeBxwi6dRrh5/faKUzRo3h0RyYqmTNng7t909/e6+yJ3/0/uvrcR2833D3L1nX2MFIIpGoZKRvNUG+UjItOfZsmtTWbO3IUg3EdLJmszYMfut8n3Dxbn3aln6gYRmdjWwVG2rN065V2p+rZeu0wFf0/XbA6YkWNouIAZ5HLGqke3s3rjACsv6xlzYDdq8esDJDJ5+f5Bvv3YO4z4U1Mezpolt3aZCv7SUTs7dr/Nqke3j/mwRNM2qPUgMrV6t+1iuBDM0TLV4axv67XLVPDDvhE+0fV3h0cKtOWs2OVT2noYmuADqlFAIvH0dM1mRg5GfeqnMNcsubXLXPBHog/L6o0D3JYfKHb5/JcPLiA6DFBw6Di48hm76lcUia+7s4MrTzuQvYd3JhLOmiW3Npm4EEs13Z0dvOfwgxgZ3dc/uOXF14ku05IDBt8aqvhcjQISqc3Cjja+smyhAnoayHTww/5X0Tp30TwOmBFerWtG9a+kuvqWiLSqzHb1RCr1D8a5MIv6FUUkjul4LDDzwQ+V5+yJ8wdSv6JI62pEIE/XY4GZD/7puDcWkWQ1KpCn6zkGmQ7+fP8gy3/cWxz/u+rLE//xJ7ujSGJH06
o7L13tTJqlUYE82XMMkvrfznTw375xgKGRAhCM2b9940DFN7c0oK5ds6XuVkISrYxqVxib7qL3opFXO2vVHaRMvUad9DWZY4FJfivJdPD7BMsw9s3PmVFwr7uVkEQr48lXR6fNV8lagjV6L0qvdpZk+adrX6s0RyMHZ9R7LDDJbyWZDv4LF8/ntg2/Y3jUmdFmXLh4/n6PKX3zcSeXMwyvq5VQrZUxmZboe49oY2b7aMWWy2Ret9oF6qu9Xi3Bmu8f5IXdb9PelmNkZN/VzpJseZX/E63eOFCsS3S/vglky3QfnJHkt5JMB393ZwerVnxw3H/68jf/6vNOrrs/ulIrY7It0WpXGJvM61Z6Low/h1Hc1knpa7fnjOVnHMvJRx+WeB9/6d+xLWfclh9gZDQoA2aMjI6tq3YE0mxJfivJdPDDxHv9qX7zo+3dvH4737v3aQ6a0VY1MOO22CvVoVoQT/Sa+f5Bvnfv0xXPSh6vnFELfnR0X+uk0rZKyzVacI4+/CC+eMaxk3pP46g6Qd+oA16cPGz1xoHisZ/2nPG5JcdwweL52gFIUyT1rSTzwR9HPW/+eAF78/rt/NUdm4vLM9oMK/h+3T+T+SZQ6WtieWu7PNQqHXAtLVNpizma1A4Y85pfOD1owUdzIJW2pLs7OxpyUK3ae19tgj7MijssY98ObmjUuXn9vmm7a/0M3Lx+O3f1vci5i+bFOilwMnTgWmqh4E/ARKF9V9+LYx5/0rx38fGT3121ZVzPgZ1K31SuX7t13FArPeCaA85cOIcrzv6D4nYrTWp34eL5Y1rwDly7Zktx5wGwdzhoSUfBW+s3qFou4BFnh1lehuj9jm6v3jhQLH+90wiX7twfeuYVZrQZowVP5MBy0geutVNJHwV/AiYK7XMXzeOhZ14pLn/+tGP36+4o75OOWti1hn/p46PXrBZq5a3x0tCPXq93264xk9o5jHlO1GIuHSHlwG35YCcRlamW4w21XMAjbhdXpbO1I6U7uNKuq9IyTRSE5Tv3oEspmZFLSY7+0GiodFLwJ2Ci7owo5H/+2HbmvutATnz3ofu9RtQqLW9hxz24XC2cLlg8n1fe2Mu6p1+uGGoXLJ6Phb/jHOy+cPF8Llw8f78W83B4fkSYd4yO1hdItV7AY6IurjjhFe0USutV6aD5eMcAynfuM9qMQoXuvNL3LY5Kf9cku8+m65mnMjkK/gRUG71TPhHcUy+9weYXXuPBZ16u2iVR2sIeGi5w9Z19FDzoMrj6vJPZ+OwQhx439ptAnFE515w/dgdS/pwLKgxtrVa3aH2k9BKW167ZMqlAqvUCHpXK9407Nhe/5dQSXhMdNB/vGEC0c6/Wx1/+fn9t8UyWTlCeajuwyQxAmOjbi65ulU4K/jpN9A9TGhr5/kGW/+iR4vkC0RDSKEBK+8DLX7/j4JnFfzwrOYEs2gmMFpw1z/eOCZ7bS/qoq43KGXxrqHipSaitZRdnJFR0/2QPanZ31n4Bj/L3/tYNvyt2PbW1TS68erpm096WK57xXb4zKZ+GorS7bLzhr0++Olpxe6Wfs/H+RvUOQKj1eIha++mg4K9DrV0HqzcOMBT2eQyNevGgaHvOGBr1/frAy18/6t4pbUFHO4FKwVMt6Kq13KoNx5wKUzEcbWFHG0uXLtxvfZy+9t5tuxgJL6lmwGe7p2Bopu87glE68unm9duLO+OJpqEob0m/94i2ivUr/xxMZes77s4+qSGF0jwK/jrU2u9pFZa7Ozv43JJjuHn9dpygDzw6m3TH7rcrts7z/YPFPviTjz4smDdoeGwIjBd0E53oFQ3HTHrc+lSMEikdetqWM6791KKK5wNUOiYxGaXvb+nIJ4Cr7+wr3gfjT0NR3pJ+47lNFbdV/jmYyta3unGyS8Ffh1r+YfL9gzjBwb2R0eDgXtR/fsHi+WPGk5eeTdqWM3zUiy328tbfyUcfxgWL5/PiCzv46vmnVz3QVxp0E/VZRydUJR36Uz
FKpHfbrmJ31kjBufrOvuJB8vLRO0mGZdSVc/3arYyWhD5MPA1F6d9j3XMTb6u0TlNB3TjZpeCvQ9x/mCjk3hkuYMBpCzr4y3PfN6ZftvrZpKGwW2HMQcWSg7ztBl+to2yRRrf6pmqUSE/XbNpyVmxhF9z5wQPPsvbJncWD36UHP5MOy56u2RwwI8fQcIFczrjsrOM49KAZkwrURgRz0t049YxckuQp+GtU+kEuPThaSe+2XbwzvO8g4KPPD/LU79/Yb/x41De/euMAQ8PBpGWRkYIXt1fpIO+Is1941vLP3OhW31TtaLo7O7j2U4v27QBzxv1P7iy2uocSHHpY6f1N6n1s5f71ekYuSWMo+GtQazdFT9dschb09Ubu6nuxYl90FBzfu/dp/v2ZV4oHZ3NmxSCpNEyyzZh0a6qR4TKVAfnFM44tjhrasfttbl6/vXhf9L41UiuHdBLijlySxlPw16DWboruzg5WfLiLHzy4rbju3EXzxn38FWf/AY89/2qxy+DaTy2qOGQvCrwDdve3XNhMdddL+Tem8vdNmiPOyCVpDgV/DerpprjqE+/j2NmHFE/kmWgmyrgt4ijw1q0bqKsuaaMDldNPnJFL0hwK/hrUGy5fPGP/uXgm2o6Cq3Z636afiUYuSXMo+GukcBGRVpdrdgFERKSxFPwiIhmj4BcRyRgFv4hIxij4RUQyRsEvIpIx5u4TP6rJzOxloL/Op88BXpnwUa0r7fUD1TEtVMfG63T3I8tXtkTwT4aZbXD3Jc0uR1LSXj9QHdNCdZw+1NUjIpIxCn4RkYzJQvD/qNkFSFja6weqY1qojtNE6vv4RURkrCy0+EVEpISCX0QkY1Ib/GZ2jpk9ZWZbzeyqZpenXmb2EzPbaWZ9JeuOMLN7zOyZ8HdHuN7M7PthnX9jZoubV/J4zOwYM1trZr81sy1mdnm4Pk11PNDMHjWzTWEd/zpcf5yZrQ/r+HMzmxmuPyBc3hrev6CZ5a+FmbWZ2eNmtiZcTlUdzex5M9tsZk+Y2YZwXct9VlMZ/GbWBlwPnAucBCw3s5OaW6q63QScU7buKuA+dz8BuC9chqC+J4Q/K4B/alAZJ2ME+O/u/j6gB/hK+LdKUx33Ah9x91OAU4FzzKwH+BZwXVjHQeDS8PGXAoPuvhC4Lnxcq7gc+G3JchrruMzdTy0Zr996n1V3T90P8EHgVyXLXwe+3uxyTaI+C4C+kuWngHnh7XnAU+HtHwLLKz2uVX6AO4GPpbWOwMHARuAMgjM828P1xc8s8Cvgg+Ht9vBx1uyyx6jbfILg+wiwBrAU1vF5YE7Zupb7rKayxQ+8B/hdyfJAuC4t5rr7iwDh76PC9S1d7/Dr/geA9aSsjmEXyBPATuAe4Flgt7uPhA8prUexjuH9rwETX+C5+b4HXAkUwuXZpK+ODtxtZnkzWxGua7nPalovvWgV1mVh3GrL1tvMZgGrgSvc/XWzSlUJHlph3bSvo7uPAqea2eHAHcD7Kj0s/N1ydTSz84Cd7p43s6XR6goPbdk6hs509x1mdhRwj5k9Oc5jp20d09riHwCOKVmeD+xoUlmS8JKZzQMIf+8M17dkvc1sBkHor3T328PVqapjxN13A+sIjmccbmZR46u0HsU6hvcfBrza2JLW7Ezgk2b2PHALQXfP90hXHXH3HeHvnQQ78NNpwc9qWoP/MeCEcETBTOALwC+bXKap9Evg4vD2xQT94tH6/xyOJugBXou+gk5XFjTtbwR+6+5/V3JXmup4ZNjSx8wOAs4mOAC6Fvhs+LDyOkZ1/yxwv4edxNOVu3/d3ee7+wKC/7f73f0iUlRHMzvEzA6NbgMfB/poxc9qsw8yJHgQ5hPA0wR9qd9odnkmUY9VwIvAMEEL4lKCvtD7gGfC30eEjzWC0UzPApuBJc0uf4z6nUXw9fc3wBPhzydSVsf3A4+HdewDrg7XdwGPAluBW4EDwvUHhstbw/u7ml2HGuu7FFiTtjqGddkU/m
yJcqUVP6uaskFEJGPS2tUjIiJVKPhFRDJGwS8ikjEKfhGRjFHwi4hkjIJfUsXMrjOzK0qWf2VmN5Qsf9fM/mISr3+NmX1tsuWsY7unmtknGr1dSScFv6TNr4EPAZhZDpgDnFxy/4eAh+O8UDjL63RxKsH5DSKTpuCXtHmYMPgJAr8PeMPMOszsAII5ch4Pz6b8WzPrC+dX/zyAmS214PoANxOcdIOZfcOCazvcC5xYaaNmNtfM7rBgzv1NZhbtfP4i3EZf9E3EzBbY2OsrfM3MrglvrzOzb1kwf//TZvbh8Ozza4HPh/PAf37K3zXJlLRO0iYZ5cEEWiNmdizBDuARghkRP0gwA+Rv3H3IzC4kaEWfQvCt4DEzezB8mdOBRe7+nJl1E0xB8AGC/5eNQL7Cpr8PPODunwm/KcwKn3sJwRTMBqw3swcI5qUfT7u7nx527XzT3c82s6sJzvz8an3vjMg+avFLGkWt/ij4HylZ/nX4mLOAVe4+6u4vAQ8Ap4X3Peruz4W3Pwzc4e5vufvrVJ/z6SOEF9oIX/O1cBt3uPub7r4HuD18vYlEE9XlCa7FIDKlFPySRlE//x8SdPX0ErT4S/v3q877DLxZtlzvvCbVtjHC2P+9A8vu3xv+HkXfyiUBCn5Jo4eB84BXw9b3q8DhBOH/SPiYBwn6zNvM7EjgjwgmCyv3IPAZMzsonJnx/CrbvA/4UyhedOVd4XM/bWYHh7M5fgZ4CHgJOMrMZofHHc6LUac3gENjPE5kQgp+SaPNBP32vWXrXnP3V8LlOwhmy9wE3A9c6e6/L38hd98I/Jxg1tDVBMFdyeXAMjPbTNBFc3L43JsIdijrgRvc/XF3HyY4WLue4BKF413MI7IWOEkHd2UqaHZOEZGMUYtfRCRjFPwiIhmj4BcRyRgFv4hIxij4RUQyRsEvIpIxCn4RkYz5/7fLLBKYZ+BaAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plot('Word count', W, 'Points per word', PPW)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see all the Pareto optimal honeycombs that score more than, say, 7.6 points per word:"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(1, 15, Honeycomb('BIMNRUV', 'V'), 15.0),\n",
" (2, 26, Honeycomb('DHNORTX', 'X'), 13.0),\n",
" (3, 31, Honeycomb('CILMOQU', 'Q'), 10.33),\n",
" (4, 32, Honeycomb('BGINOUX', 'X'), 8.0),\n",
" (5, 45, Honeycomb('CEGIPTX', 'G'), 9.0),\n",
" (6, 50, Honeycomb('DELNPUZ', 'Z'), 8.33),\n",
" (7, 62, Honeycomb('BGILNOX', 'X'), 8.86),\n",
" (8, 67, Honeycomb('DGINOXZ', 'X'), 8.38),\n",
" (9, 70, Honeycomb('EFNQRTU', 'Q'), 7.78),\n",
" (10, 84, Honeycomb('CENOQRU', 'Q'), 8.4),\n",
" (11, 86, Honeycomb('GINOTUV', 'V'), 7.82),\n",
" (12, 100, Honeycomb('GILMNUZ', 'Z'), 8.33),\n",
" (13, 108, Honeycomb('GINOQTU', 'Q'), 8.31),\n",
" (14, 113, Honeycomb('CINOTXY', 'X'), 8.07),\n",
" (15, 115, Honeycomb('DGINOXZ', 'Z'), 7.67),\n",
" (19, 157, Honeycomb('DEIORXZ', 'X'), 8.26),\n",
" (22, 172, Honeycomb('DEGINPZ', 'Z'), 7.82),\n",
" (23, 184, Honeycomb('ACELQRU', 'Q'), 8.0),\n",
" (26, 198, Honeycomb('AILNOTZ', 'Z'), 7.62),\n",
" (28, 224, Honeycomb('DEGINRZ', 'Z'), 8.0),\n",
" (45, 374, Honeycomb('ACINOTV', 'V'), 8.31),\n",
" (403, 3095, Honeycomb('AEGINRT', 'G'), 7.68),\n",
" (442, 3406, Honeycomb('AEGINRT', 'I'), 7.71)]"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[entry for entry in ph if entry[-1] > 7.6]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The last few honeycombs on the right-hand side all rise above the average points/word. We can see that they are all variants of the highest-scoring honeycomb, but with different centers:"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(403, 3095, Honeycomb('AEGINRT', 'G'), 7.68),\n",
" (442, 3406, Honeycomb('AEGINRT', 'I'), 7.71),\n",
" (466, 3421, Honeycomb('AEGINRT', 'T'), 7.34),\n",
" (512, 3782, Honeycomb('AEGINRT', 'N'), 7.39),\n",
" (537, 3898, Honeycomb('AEGINRT', 'R'), 7.26)]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ph[-5:]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here are reports on what I think are the most interesting low-word-count, higher-score honeycombs. I would have scored zero on the first one, and probably not much better on the second."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Honeycomb('CEGIPTX', 'G') scores 45 points on 5 words from a 44585 word list:\n",
"\n",
"CEGIPTX 17 points 1 pangram EPEXEGETIC:17\n",
" CEGITX 8 points 1 word EXEGETIC:8\n",
" CEGIP 7 points 1 word EPIGEIC:7\n",
" EGIP 6 points 1 word PIGGIE:6\n",
" EGTX 7 points 1 word EXEGETE:7\n"
]
}
],
"source": [
"report(Honeycomb('CEGIPTX', 'G'))"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Honeycomb('DEIORXZ', 'X') scores 157 points on 19 words from a 44585 word list:\n",
"\n",
"DEIORXZ 65 points 4 pangrams DEOXIDIZER:17 OXIDIZER:15 REOXIDIZE:16 REOXIDIZED:17\n",
" DEIOXZ 34 points 4 words DEOXIDIZE:9 DEOXIDIZED:10 OXIDIZE:7 OXIDIZED:8\n",
" DEIOX 23 points 4 words DIOXIDE:7 DOXIE:5 EXODOI:6 OXIDE:5\n",
" DEORX 12 points 2 words REDOX:5 XEROXED:7\n",
" DEIX 5 points 1 word DEXIE:5\n",
" DIOX 13 points 3 words DIOXID:6 IXODID:6 OXID:1\n",
" EORX 5 points 1 word XEROX:5\n"
]
}
],
"source": [
"report(Honeycomb('DEIORXZ', 'X'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following I think are decent puzzles:"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Honeycomb('ACINOTV', 'V') scores 374 points on 45 words from a 44585 word list:\n",
"\n",
"ACINOTV 171 points 10 pangrams ACTIVATION:17 AVOCATION:16 CAVITATION:17 CONVOCATION:18 INACTIVATION:19\n",
" INVOCATION:17 VACATION:15 VACCINATION:18 VATICINATION:19 VOCATION:15\n",
" ACINOV 7 points 1 word AVIONIC:7\n",
" ACINTV 8 points 1 word CAVATINA:8\n",
" AINOTV 62 points 7 words AVIATION:8 INNOVATION:10 INVITATION:10 NOVATION:8 OVATION:7 TITIVATION:10\n",
" VITIATION:9\n",
" CINOTV 17 points 2 words CONVICT:7 CONVICTION:10\n",
" ACINV 20 points 3 words VACCINA:7 VACCINIA:8 VINCA:5\n",
" ACITV 24 points 4 words ATAVIC:6 VATIC:5 VIATIC:6 VIATICA:7\n",
" ACNTV 6 points 1 word VACANT:6\n",
" ACOTV 6 points 1 word OCTAVO:6\n",
" AINOV 5 points 1 word AVION:5\n",
" CINOV 11 points 2 words COVIN:5 OVONIC:6\n",
" AINV 7 points 3 words AVIAN:5 VAIN:1 VINA:1\n",
" AITV 6 points 2 words VITA:1 VITTA:5\n",
" ANOV 1 point 1 word NOVA:1\n",
" ANTV 5 points 1 word AVANT:5\n",
" AOTV 6 points 1 word OTTAVA:6\n",
" CINV 5 points 1 word VINIC:5\n",
" INOV 1 point 1 word VINO:1\n",
" AIV 1 point 1 word VIVA:1\n",
" CIV 5 points 1 word CIVIC:5\n"
]
}
],
"source": [
"report(Honeycomb('ACINOTV', 'V'))"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Honeycomb('ACINOTU', 'U') scores 385 points on 55 words from a 44585 word list:\n",
"\n",
"ACINOTU 162 points 10 pangrams ACTUATION:16 ANNUNCIATION:19 AUCTION:14 CAUTION:14 CONTINUA:15 CONTINUANT:17\n",
" CONTINUATION:19 COUNTIAN:15 CUNCTATION:17 INCAUTION:16\n",
" ACINTU 6 points 1 word TUNICA:6\n",
" ACNOTU 31 points 4 words ACCOUNT:7 ACCOUNTANT:10 COCOANUT:8 TOUCAN:6\n",
" AINOTU 17 points 2 words ANTIUNION:9 NUTATION:8\n",
" CINOTU 24 points 3 words CONTINUO:8 INUNCTION:9 UNCTION:7\n",
" ACINU 5 points 1 word UNCIA:5\n",
" ACOTU 6 points 1 word OUTACT:6\n",
" AINTU 9 points 1 word ANNUITANT:9\n",
" CINOU 13 points 2 words INCONNU:7 NUNCIO:6\n",
" CINTU 10 points 2 words CUTIN:5 TUNIC:5\n",
" CNOTU 20 points 3 words COCONUT:7 COUNT:5 OUTCOUNT:8\n",
" INOTU 16 points 2 words INTUITION:9 TUITION:7\n",
" AINU 1 point 1 word UNAI:1\n",
" ANTU 13 points 4 words AUNT:1 NUTANT:6 TAUNT:5 TUNA:1\n",
" AOTU 1 point 1 word AUTO:1\n",
" CINU 7 points 2 words UNCI:1 UNCINI:6\n",
" CNOU 1 point 1 word UNCO:1\n",
" CNTU 6 points 2 words CUNT:1 UNCUT:5\n",
" COTU 6 points 1 word CUTOUT:6\n",
" INOU 13 points 2 words NONUNION:8 UNION:5\n",
" INTU 7 points 2 words INTUIT:6 UNIT:1\n",
" NOTU 1 point 1 word UNTO:1\n",
" ANU 1 point 1 word UNAU:1\n",
" ATU 1 point 1 word TAUT:1\n",
" ITU 5 points 1 word TUTTI:5\n",
" NOU 1 point 1 word NOUN:1\n",
" OTU 1 point 1 word TOUT:1\n",
" TU 1 point 1 word TUTU:1\n"
]
}
],
"source": [
"report(Honeycomb('ACINOTU', 'U'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step 11: S Words\n",
"\n",
"What if we allowed honeycombs and words to have an 'S' in them?"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(98141, 44585)"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"enable1s = valid_words(open('enable1.txt').read(), \n",
" lambda w: len(w) >= 4 and len(set(w)) <= 7)\n",
"\n",
"len(enable1s), len(enable1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Allowing 'S' more than doubles the number of words. Will it double the score of the best honeycomb?"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The best Honeycomb('AEINRST', 'E') scores 8681 points on 1179 words from a 98141 word list:\n",
"\n",
"AEINRST 1381 points 86 pangrams ANESTRI:14 ANTISERA:15 ANTISTRESS:17 ANTSIER:14 ARENITES:15 ARSENITE:15\n",
" ARSENITES:16 ARTINESS:15 ARTINESSES:17 ATTAINERS:16 ENTERTAINERS:19 ENTERTAINS:17 ENTRAINERS:17\n",
" ENTRAINS:15 ENTREATIES:17 ERRANTRIES:17 INERTIAS:15 INSTANTER:16 INTENERATES:18 INTERSTATE:17\n",
" INTERSTATES:18 INTERSTRAIN:18 INTERSTRAINS:19 INTRASTATE:17 INTREATS:15 IRATENESS:16 IRATENESSES:18\n",
" ITINERANTS:17 ITINERARIES:18 ITINERATES:17 NASTIER:14 NITRATES:15 RAINIEST:15 RATANIES:15 RATINES:14\n",
" REATTAINS:16 REINITIATES:18 REINSTATE:16 REINSTATES:17 RESINATE:15 RESINATES:16 RESISTANT:16\n",
" RESISTANTS:17 RESTRAIN:15 RESTRAINER:17 RESTRAINERS:18 RESTRAINS:16 RESTRAINT:16 RESTRAINTS:17\n",
" RETAINERS:16 RETAINS:14 RETINAS:14 RETIRANTS:16 RETRAINS:15 RETSINA:14 RETSINAS:15 SANITARIES:17\n",
" SEATRAIN:15 SEATRAINS:16 STAINER:14 STAINERS:15 STANNARIES:17 STEARIN:14 STEARINE:15 STEARINES:16\n",
" STEARINS:15 STRAINER:15 STRAINERS:16 STRAITEN:15 STRAITENS:16 STRAITNESS:17 STRAITNESSES:19\n",
" TANISTRIES:17 TANNERIES:16 TEARSTAIN:16 TEARSTAINS:17 TENANTRIES:17 TERNARIES:16 TERRAINS:15\n",
" TERTIANS:15 TRAINEES:15 TRAINERS:15 TRANSIENT:16 TRANSIENTS:17 TRISTEARIN:17 TRISTEARINS:18\n",
" AEINRS 124 points 16 words AIRINESS:8 AIRINESSES:10 ANSERINE:8 ANSERINES:9 ARISEN:6 ARSINE:6 ARSINES:7\n",
" INSANER:7 INSNARE:7 INSNARER:8 INSNARERS:9 INSNARES:8 SENARII:7 SIERRAN:7 SIRENIAN:8 SIRENIANS:9\n",
" AEINRT 232 points 30 words ARENITE:7 ATTAINER:8 ENTERTAIN:9 ENTERTAINER:11 ENTRAIN:7 ENTRAINER:9 INERRANT:8\n",
" INERTIA:7 INERTIAE:8 INTENERATE:10 INTREAT:7 ITERANT:7 ITINERANT:9 ITINERATE:9 NATTIER:7 NITRATE:7\n",
" RATINE:6 REATTAIN:8 REINITIATE:10 RETAIN:6 RETAINER:8 RETINA:6 RETINAE:7 RETIRANT:8 RETRAIN:7\n",
" TERRAIN:7 TERTIAN:7 TRAINEE:7 TRAINER:7 TRIENNIA:8\n",
" AEINST 713 points 80 words ANISETTE:8 ANISETTES:9 ANTISENSE:9 ANTISTATE:9 ANTSIEST:8 ASININITIES:11\n",
" ASSASSINATE:11 ASSASSINATES:12 ASTATINE:8 ASTATINES:9 ENTASIA:7 ENTASIAS:8 ENTASIS:7 ETESIAN:7\n",
" ETESIANS:8 INANEST:7 INANITIES:9 INITIATES:9 INNATENESS:10 INNATENESSES:12 INSANEST:8 INSANITIES:10\n",
" INSATIATE:9 INSATIATENESS:13 INSATIATENESSES:15 INSENSATE:9 INSTANTANEITIES:15 INSTANTIATE:11\n",
" INSTANTIATES:12 INSTANTNESS:11 INSTANTNESSES:13 INSTATE:7 INSTATES:8 INTESTATE:9 INTESTATES:10\n",
" ISATINE:7 ISATINES:8 NASTIES:7 NASTIEST:8 NASTINESS:9 NASTINESSES:11 NATTIEST:8 NATTINESS:9\n",
" NATTINESSES:11 SANITATE:8 SANITATES:9 SANITIES:8 SANITISE:8 SANITISES:9 SATINET:7 SATINETS:8\n",
" SENTENTIA:9 SENTENTIAE:10 SESTINA:7 SESTINAS:8 STANINE:7 STANINES:8 STANNITE:8 STANNITES:9 TAENIAS:7\n",
" TAENIASES:9 TAENIASIS:9 TANSIES:7 TASTINESS:9 TASTINESSES:11 TATTINESS:9 TATTINESSES:11 TENIAS:6\n",
" TENIASES:8 TENIASIS:8 TETANIES:8 TETANISE:8 TETANISES:9 TINEAS:6 TISANE:6 TISANES:7 TITANATES:9\n",
" TITANESS:8 TITANESSES:10 TITANITES:9\n",
" AEIRST 473 points 60 words AERIEST:7 AIREST:6 AIRIEST:7 ARIETTAS:8 ARIETTES:8 ARISTAE:7 ARISTATE:8 ARTERIES:8\n",
" ARTERITIS:9 ARTIEST:7 ARTISTE:7 ARTISTES:8 ARTISTRIES:10 ARTSIER:7 ARTSIEST:8 ASSISTER:8 ASSISTERS:9\n",
" ASTERIA:7 ASTERIAS:8 ATRESIA:7 ATRESIAS:8 ATTIRES:7 EATERIES:8 IRATEST:7 IRRITATES:9 ITERATES:8\n",
" RARITIES:8 RATITES:7 RATTIEST:8 REITERATES:10 SATIRE:6 SATIRES:7 SATIRISE:8 SATIRISES:9 SERIATE:7\n",
" SERIATES:8 SESTERTIA:9 STARRIER:8 STARRIEST:9 STRAITER:8 STRAITEST:9 STRIAE:6 STRIATE:7 STRIATES:8\n",
" TARRIERS:8 TARRIES:7 TARRIEST:8 TARSIER:7 TARSIERS:8 TASTIER:7 TEARIEST:8 TERAIS:6 TERTIARIES:10\n",
" TITRATES:8 TRAITRESS:9 TRAITRESSES:11 TREATIES:8 TREATISE:8 TREATISES:9 TRISTATE:8\n",
" AENRST 336 points 40 words ANTEATERS:9 ANTRES:6 ARRESTANT:9 ARRESTANTS:10 ARSENATE:8 ARSENATES:9 ASSENTER:8\n",
" ASSENTERS:9 ASTERN:6 EARNEST:7 EARNESTNESS:11 EARNESTNESSES:13 EARNESTS:8 EASTERN:7 EASTERNER:9\n",
" EASTERNERS:10 ENTRANTS:8 ENTREATS:8 ERRANTS:7 NARRATERS:9 NARRATES:8 NATTERS:7 NEAREST:7 RANTERS:7\n",
" RATTEENS:8 RATTENERS:9 RATTENS:7 REENTRANTS:10 RETREATANTS:11 SARSENET:8 SARSENETS:9 SERENATA:8\n",
" SERENATAS:9 SERENATE:8 STERNA:6 TANNERS:7 TARANTASES:10 TARTNESS:8 TARTNESSES:10 TERRANES:8\n",
" EINRST 582 points 70 words ENTERITIS:9 ENTERITISES:11 ENTIRENESS:10 ENTIRENESSES:12 ENTIRES:7 ENTIRETIES:10\n",
" ENTRIES:7 ESTRIN:6 ESTRINS:7 ETERNISE:8 ETERNISES:9 ETERNITIES:10 INERTNESS:9 INERTNESSES:11 INERTS:6\n",
" INSERT:6 INSERTER:8 INSERTERS:9 INSERTS:7 INSETTER:8 INSETTERS:9 INSISTER:8 INSISTERS:9 INTENSER:8\n",
" INTEREST:8 INTERESTS:9 INTERNEES:9 INTERNES:8 INTERNIST:9 INTERNISTS:10 INTERNS:7 INTERS:6 INTERTIES:9\n",
" NITERIES:8 NITERS:6 NITRES:6 NITRITES:8 REENTRIES:9 REINSERT:8 REINSERTS:9 REINTERS:8 RENTIERS:8\n",
" RETINENES:9 RETINES:7 RETINITES:9 RETINITIS:9 RETINTS:7 SENTRIES:8 SERENITIES:10 SINISTER:8\n",
" SINISTERNESS:12 SINISTERNESSES:14 SINTER:6 SINTERS:7 STERNITE:8 STERNITES:9 STINTER:7 STINTERS:8\n",
" TEENSIER:8 TEENTSIER:9 TERRINES:8 TINNERS:7 TINTERS:7 TRIENES:7 TRIENS:6 TRIENTES:8 TRINES:6\n",
" TRINITIES:9 TRITENESS:9 TRITENESSES:11\n",
" AEINR 19 points 3 words INANER:6 NARINE:6 RAINIER:7\n",
" AEINS 129 points 17 words ANISE:5 ANISES:6 ASININE:7 EASINESS:8 EASINESSES:10 INANENESS:9 INANENESSES:11\n",
" INANES:6 INSANE:6 INSANENESS:10 INSANENESSES:12 NANNIES:7 SANIES:6 SANSEI:6 SANSEIS:7 SIENNA:6\n",
" SIENNAS:7\n",
" AEINT 64 points 10 words ENTIA:5 INITIATE:8 INNATE:6 TAENIA:6 TAENIAE:7 TENIA:5 TENIAE:6 TINEA:5 TITANATE:8\n",
" TITANITE:8\n",
" AEIRS 106 points 17 words AERIES:6 AIRERS:6 ARISE:5 ARISES:6 ARRISES:7 EASIER:6 RAISE:5 RAISER:6 RAISERS:7\n",
" RAISES:6 RERAISE:7 RERAISES:8 SASSIER:7 SERAI:5 SERAIS:6 SIERRA:6 SIERRAS:7\n",
" AEIRT 135 points 20 words ARIETTA:7 ARIETTE:7 ARTIER:6 ATTIRE:6 ATTRITE:7 IRATE:5 IRATER:6 IRRITATE:8\n",
" ITERATE:7 RATITE:6 RATTIER:7 REITERATE:9 RETIA:5 RETIARII:8 TARRIER:7 TATTIER:7 TEARIER:7 TERAI:5\n",
" TERRARIA:8 TITRATE:7\n",
" AEIST 112 points 15 words EASIEST:7 ETATIST:7 SASSIEST:8 SATIATE:7 SATIATES:8 SATIETIES:9 SIESTA:6 SIESTAS:7\n",
" STEATITE:8 STEATITES:9 TASSIE:6 TASSIES:7 TASTIEST:8 TATTIES:7 TATTIEST:8\n",
" AENRS 172 points 25 words ANEARS:6 ARENAS:6 EARNERS:7 EARNS:5 ENSNARE:7 ENSNARER:8 ENSNARERS:9 ENSNARES:8\n",
" NARES:5 NEARNESS:8 NEARNESSES:10 NEARS:5 RANEES:6 RARENESS:8 RARENESSES:10 REEARNS:7 RENNASE:7\n",
" RENNASES:8 SANER:5 SARSEN:6 SARSENS:7 SNARE:5 SNARER:6 SNARERS:7 SNARES:6\n",
" AENRT 132 points 19 words ANTEATER:8 ANTRE:5 ENTERA:6 ENTRANT:7 ENTREAT:7 ERRANT:6 NARRATE:7 NARRATER:8\n",
" NATTER:6 NEATER:6 RANTER:6 RATTEEN:7 RATTEN:6 RATTENER:8 REENTRANT:9 RETREATANT:10 TANNER:6 TERNATE:7\n",
" TERRANE:7\n",
" AENST 217 points 32 words ANATASE:7 ANATASES:8 ANENST:6 ANNATES:7 ANSATE:6 ANTENNAS:8 ANTES:5 ASSENT:6\n",
" ASSENTS:7 ENATES:6 ENTASES:7 ETNAS:5 NATES:5 NEATENS:7 NEATEST:7 NEATNESS:8 NEATNESSES:10 NEATS:5\n",
" SANEST:6 SATEEN:6 SATEENS:7 SENATE:6 SENATES:7 SENSATE:7 SENSATES:8 SETENANT:8 SETENANTS:9 STANE:5\n",
" STANES:6 TANNATES:8 TANNEST:7 TENANTS:7\n",
" AERST 604 points 85 words AERATES:7 ARETES:6 ARREST:6 ARRESTEE:8 ARRESTEES:9 ARRESTER:8 ARRESTERS:9\n",
" ARRESTS:7 ASSERT:6 ASSERTER:8 ASSERTERS:9 ASSERTS:7 ASTER:5 ASTERS:6 ATTESTER:8 ATTESTERS:9 EASTER:6\n",
" EASTERS:7 EATERS:6 ERRATAS:7 ESTERASE:8 ESTERASES:9 ESTREAT:7 ESTREATS:8 RAREST:6 RASTER:6 RASTERS:7\n",
" RATERS:6 RATES:5 RATTERS:7 REARREST:8 REARRESTS:9 REASSERT:8 REASSERTS:9 REATAS:6 RESEAT:6 RESEATS:7\n",
" RESTART:7 RESTARTS:8 RESTATE:7 RESTATES:8 RETASTE:7 RETASTES:8 RETEARS:7 RETREATERS:10 RETREATS:8\n",
" SEAREST:7 SEATER:6 SEATERS:7 SERRATE:7 SERRATES:8 STARE:5 STARER:6 STARERS:7 STARES:6 STARETS:7\n",
" STARTER:7 STARTERS:8 STATER:6 STATERS:7 STEARATE:8 STEARATES:9 STRASSES:8 STRETTA:7 STRETTAS:8 TARES:5\n",
" TARRES:6 TARTEST:7 TARTRATES:9 TASTER:6 TASTERS:7 TATERS:6 TATTERS:7 TEARERS:7 TEARS:5 TEASER:6\n",
" TEASERS:7 TERRAS:6 TERRASES:8 TESSERA:7 TESSERAE:8 TETRAS:6 TRASSES:7 TREATERS:8 TREATS:6\n",
" EINRS 184 points 29 words EERINESS:8 EERINESSES:10 ESERINE:7 ESERINES:8 INNERS:6 NEREIS:6 REINS:5 RENINS:6\n",
" RENNINS:7 RERISEN:7 RESIN:5 RESINS:6 RINSE:5 RINSER:6 RINSERS:7 RINSES:6 RISEN:5 SEINER:6 SEINERS:7\n",
" SEREIN:6 SEREINS:7 SERIN:5 SERINE:6 SERINES:7 SERINS:6 SINNER:6 SINNERS:7 SIREN:5 SIRENS:6\n",
" EINRT 190 points 29 words ENTIRE:6 INERT:5 INTER:5 INTERN:6 INTERNE:7 INTERNEE:8 INTERTIE:8 NETTIER:7\n",
" NITER:5 NITERIE:7 NITRE:5 NITRITE:7 NITTIER:7 REINTER:7 RENITENT:8 RENTIER:7 RETINE:6 RETINENE:8\n",
" RETINITE:8 RETINT:6 TEENIER:7 TENTIER:7 TERRINE:7 TINIER:6 TINNER:6 TINNIER:7 TINTER:6 TRIENE:6\n",
" TRINE:5\n",
" EINST 469 points 58 words EINSTEIN:8 EINSTEINS:9 ENTITIES:8 INSENTIENT:10 INSET:5 INSETS:6 INSISTENT:9\n",
" INTENSE:7 INTENSENESS:11 INTENSENESSES:13 INTENSEST:9 INTENSITIES:11 INTENTNESS:10 INTENTNESSES:12\n",
" INTENTS:7 INTESTINE:9 INTESTINES:10 INTINES:7 NEIST:5 NETTIEST:8 NINETEENS:9 NINETIES:8 NITES:5\n",
" NITTIEST:8 SENITI:6 SENNIT:6 SENNITS:7 SENSITISE:9 SENSITISES:10 SENTI:5 SENTIENT:8 SENTIENTS:9\n",
" SESTINE:7 SESTINES:8 SIENITE:7 SIENITES:8 SITTEN:6 STEIN:5 STEINS:6 TEENIEST:8 TEENSIEST:9\n",
" TEENTSIEST:10 TENNIES:7 TENNIS:6 TENNISES:8 TENNIST:7 TENNISTS:8 TENSITIES:9 TENTIEST:8 TESTINESS:9\n",
" TESTINESSES:11 TINES:5 TINIEST:7 TININESS:8 TININESSES:10 TINNIEST:8 TINNINESS:9 TINNINESSES:11\n",
" EIRST 262 points 38 words EERIEST:7 IRITISES:8 RESIST:6 RESISTER:8 RESISTERS:9 RESISTS:7 RESITE:6 RESITES:7\n",
" RETIES:6 RETIREES:8 RETIRERS:8 RETIRES:7 RETRIES:7 RITES:5 RITTERS:7 SISTER:6 SISTERS:7 SITTER:6\n",
" SITTERS:7 STIRRER:7 STIRRERS:8 STRETTI:7 TERRIERS:8 TERRIES:7 TERRITS:7 TESTIER:7 TIERS:5 TIRES:5\n",
" TITERS:6 TITRES:6 TITTERERS:9 TITTERS:7 TRESSIER:8 TRESSIEST:9 TRIERS:6 TRIES:5 TRISTE:6 TRITEST:7\n",
" ENRST 246 points 35 words ENTERERS:8 ENTERS:6 ENTREES:7 NERTS:5 NESTER:6 NESTERS:7 NETTERS:7 REENTERS:8\n",
" RENEST:6 RENESTS:7 RENNETS:7 RENTERS:7 RENTES:6 RENTS:5 RESENT:6 RESENTS:7 RETENES:7 SERENEST:8\n",
" STERN:5 STERNER:7 STERNEST:8 STERNNESS:9 STERNNESSES:11 STERNS:6 TEENERS:7 TENNERS:7 TENSER:6\n",
" TENTERS:7 TERNES:6 TERNS:5 TERREENS:8 TERRENES:8 TERSENESS:9 TERSENESSES:11 TREENS:6\n",
" AEIN 11 points 2 words INANE:5 NANNIE:6\n",
" AEIR 22 points 4 words AERIE:5 AERIER:6 AIRER:5 AIRIER:6\n",
" AEIS 13 points 2 words EASIES:6 SASSIES:7\n",
" AEIT 6 points 1 word TATTIE:6\n",
" AENR 40 points 9 words ANEAR:5 ARENA:5 EARN:1 EARNER:6 NEAR:1 NEARER:6 RANEE:5 REEARN:6 RERAN:5\n",
" AENS 46 points 9 words ANES:1 ANSAE:5 SANE:1 SANENESS:8 SANENESSES:10 SANES:5 SENNA:5 SENNAS:6 SENSA:5\n",
" AENT 63 points 13 words ANENT:5 ANTAE:5 ANTE:1 ANTENNA:7 ANTENNAE:8 ATTENT:6 EATEN:5 ENATE:5 ETNA:1 NEAT:1\n",
" NEATEN:6 TANNATE:7 TENANT:6\n",
" AERS 121 points 26 words AREAS:5 ARES:1 ARREARS:7 ARSE:1 ARSES:5 EARS:1 ERAS:1 ERASE:5 ERASER:6 ERASERS:7\n",
" ERASES:6 RARES:5 RASE:1 RASER:5 RASERS:6 RASES:5 REARERS:7 REARS:5 REASSESS:8 REASSESSES:10 SAREE:5\n",
" SAREES:6 SEAR:1 SEARER:6 SEARS:5 SERA:1\n",
" AERT 127 points 24 words AERATE:6 ARETE:5 EATER:5 ERRATA:6 RATE:1 RATER:5 RATTER:6 REATA:5 RETEAR:6\n",
" RETREAT:7 RETREATER:9 TARE:1 TARRE:5 TARTER:6 TARTRATE:8 TATER:5 TATTER:6 TEAR:1 TEARER:6 TERRA:5\n",
" TERRAE:6 TETRA:5 TREAT:5 TREATER:7\n",
" AEST 164 points 35 words ASSET:5 ASSETS:6 ATES:1 ATTEST:6 ATTESTS:7 EAST:1 EASTS:5 EATS:1 ESTATE:6\n",
" ESTATES:7 ETAS:1 SATE:1 SATES:5 SEAT:1 SEATS:5 SETA:1 SETAE:5 STASES:6 STATE:5 STATES:6 TASSE:5\n",
" TASSES:6 TASSET:6 TASSETS:7 TASTE:5 TASTES:6 TATES:5 TEAS:1 TEASE:5 TEASES:6 TEATS:5 TESTA:5 TESTAE:6\n",
" TESTATE:7 TESTATES:8\n",
" EINR 17 points 4 words INNER:5 REIN:1 RENIN:5 RENNIN:6\n",
" EINS 53 points 10 words NINES:5 NINNIES:7 NISEI:5 NISEIS:6 SEINE:5 SEINES:6 SEISIN:6 SEISINS:7 SINE:1\n",
" SINES:5\n",
" EINT 28 points 6 words INTENT:6 INTINE:6 NINETEEN:8 NITE:1 TENTIE:6 TINE:1\n",
" EIRS 101 points 20 words IRES:1 IRISES:6 REIS:1 RERISE:6 RERISES:7 RISE:1 RISER:5 RISERS:6 RISES:5 SEISER:6\n",
" SEISERS:7 SERIES:6 SERRIES:7 SIRE:1 SIREE:5 SIREES:6 SIRES:5 SIRREE:6 SIRREES:7 SISSIER:7\n",
" EIRT 87 points 17 words RETIE:5 RETIRE:6 RETIREE:7 RETIRER:7 RITE:1 RITTER:6 TERRIER:7 TERRIT:6 TIER:1\n",
" TIRE:1 TITER:5 TITRE:5 TITTER:6 TITTERER:8 TRIER:5 TRITE:5 TRITER:6\n",
" EIST 41 points 8 words SISSIEST:8 SITE:1 SITES:5 STIES:5 TESTIEST:8 TESTIS:6 TIES:1 TITTIES:7\n",
" ENRS 80 points 12 words ERNES:5 ERNS:1 RESEEN:6 SERENE:6 SERENENESS:10 SERENENESSES:12 SERENER:7 SERENES:7\n",
" SNEER:5 SNEERER:7 SNEERERS:8 SNEERS:6\n",
" ENRT 104 points 19 words ENTER:5 ENTERER:7 ENTREE:6 ETERNE:6 NETTER:6 REENTER:7 RENNET:6 RENT:1 RENTE:5\n",
" RENTER:6 RETENE:6 TEENER:6 TENNER:6 TENTER:6 TERN:1 TERNE:5 TERREEN:7 TERRENE:7 TREEN:5\n",
" ENST 94 points 18 words ENTENTES:8 NEST:1 NESTS:5 NETS:1 NETTS:5 SENNET:6 SENNETS:7 SENT:1 SENTE:5 TEENS:5\n",
" TENETS:6 TENS:1 TENSE:5 TENSENESS:9 TENSENESSES:11 TENSES:6 TENSEST:7 TENTS:5\n",
" ERST 266 points 44 words ERST:1 ESTER:5 ESTERS:6 REEST:5 REESTS:6 RESET:5 RESETS:6 RESETTER:8 RESETTERS:9\n",
" REST:1 RESTER:6 RESTERS:7 RESTRESS:8 RESTRESSES:10 RESTS:5 RETEST:6 RETESTS:7 RETS:1 SEREST:6 SETTER:6\n",
" SETTERS:7 STEER:5 STEERER:7 STEERERS:8 STEERS:6 STERE:5 STERES:6 STREET:6 STREETS:7 STRESS:6\n",
" STRESSES:8 STRETTE:7 TEETERS:7 TERRETS:7 TERSE:5 TERSER:6 TERSEST:7 TESTER:6 TESTERS:7 TETTERS:7\n",
" TREES:5 TRESS:5 TRESSES:7 TRETS:5\n",
" AER 25 points 7 words AREA:1 AREAE:5 ARREAR:6 RARE:1 RARER:5 REAR:1 REARER:6\n",
" AES 33 points 8 words ASEA:1 ASSES:5 ASSESS:6 ASSESSES:8 EASE:1 EASES:5 SASSES:6 SEAS:1\n",
" AET 2 points 2 words TATE:1 TEAT:1\n",
" EIN 1 point 1 word NINE:1\n",
" EIR 11 points 2 words EERIE:5 EERIER:6\n",
" EIS 35 points 7 words ISSEI:5 ISSEIS:6 SEIS:1 SEISE:5 SEISES:6 SISES:5 SISSIES:7\n",
" EIT 6 points 1 word TITTIE:6\n",
" ENR 1 point 1 word ERNE:1\n",
" ENS 20 points 6 words NESS:1 NESSES:6 SEEN:1 SENE:1 SENSE:5 SENSES:6\n",
" ENT 15 points 5 words ENTENTE:7 NETT:1 TEEN:1 TENET:5 TENT:1\n",
" ERS 52 points 13 words ERRS:1 ERSES:5 REES:1 RESEE:5 RESEES:6 SEER:1 SEERESS:7 SEERESSES:9 SEERS:5 SERE:1\n",
" SERER:5 SERES:5 SERS:1\n",
" ERT 27 points 7 words RETE:1 TEETER:6 TERETE:6 TERRET:6 TETTER:6 TREE:1 TRET:1\n",
" EST 79 points 18 words SESTET:6 SESTETS:7 SETS:1 SETT:1 SETTEE:6 SETTEES:7 SETTS:5 STET:1 STETS:5 TEES:1\n",
" TEST:1 TESTEE:6 TESTEES:7 TESTES:6 TESTS:5 TETS:1 TSETSE:6 TSETSES:7\n",
" EN 1 point 1 word NENE:1\n",
" ES 7 points 3 words ESES:1 ESSES:5 SEES:1\n"
]
}
],
"source": [
"report(words=enable1s)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Yes it does (roughly) double the score!\n",
"\n",
"# Summary\n",
"\n",
"This notebook showed how to find the highest-scoring honeycomb. Thanks to a series of ideas, we were able to achieve a substantial reduction in the number of honeycombs that need to be examined (a factor of 400), the run time needed for `game_score` (a factor of about 200), and the overall run time (a factor of about 70,000).\n",
"\n",
"- **Brute Force Enumeration** (3,364,900 honeycombs; 10 hours (estimate) run time) Try every possible honeycomb.\n",
"- **Pangram Lettersets** (55,902 honeycombs; 10 minutes (estimate) run time) Try just the honeycombs that are pangram lettersets (with every center).\n",
"- **Points Table** (55,902 honeycombs; under 2 seconds run time) Precompute the score for each letterset, and sum the 64 letter subsets of each honeycomb.\n",
"- **Branch and Bound** (8,084 honeycombs; under 1/2 second run time) Try every center only for lettersets that score better than the best score so far.\n",
"\n",
"\n",
"\n",
"Here are pictures for the highest-scoring honeycombs, with and without an S:\n",
"\n",
"\n",
"