Add files via upload

2020-01-24 19:49:20 -08:00 · 2020-01-24 19:49:20 -08:00 · c965960fc6
commit c965960fc6
parent 3c91a097fe
1 changed files with 282 additions and 195 deletions
--- a/ipynb/SpellingBee.ipynb
+++ b/ipynb/SpellingBee.ipynb
@ -8,17 +8,19 @@
    "\n",
    "# Spelling Bee Puzzle\n",
    "\n",
-    "The [Jan. 3 2020 Riddler](https://fivethirtyeight.com/features/can-you-solve-the-vexing-vexillology/) concerns the popular NYTimes  [Spelling Bee](https://www.nytimes.com/puzzles/spelling-bee) puzzle:\n",
+    "The [Jan. 3 2020 Riddler](https://fivethirtyeight.com/features/can-you-solve-the-vexing-vexillology/) is about the popular NY Times  [Spelling Bee](https://www.nytimes.com/puzzles/spelling-bee) puzzle:\n",
    "\n",
-    "*In this game, seven letters are arranged in a honeycomb lattice, with one letter in the center. Here’s the lattice from December 24, 2019:*\n",
+    "*In this game, seven letters are arranged in a honeycomb lattice, with one letter in the center. Here’s the lattice from Dec. 24, 2019:*\n",
    "\n",
    "<img src=\"https://fivethirtyeight.com/wp-content/uploads/2020/01/Screen-Shot-2019-12-24-at-5.46.55-PM.png?w=1136\" width=\"150\">\n",
    "\n",
    "<img src=\"https://fivethirtyeight.com/wp-content/uploads/2020/01/Screen-Shot-2019-12-24-at-5.46.55-PM.png?w=1136\" width=\"150\" style=\"float:left;width:150px;height:150px;\">\n",
    "\n",
    "*The goal is to identify as many words that meet the following criteria:*\n",
-    "1. *The word must be at least four letters long.*\n",
+    "\n",
-    "2. *The word must include the central letter.*\n",
+    " (1) *The word must be at least four letters long.*\n",
-    "3. *The word cannot include any letter beyond the seven given letters.*\n",
+    " \n",
    " (2) *The word must include the central letter.*\n",
    " \n",
    " (3) *The word cannot include any letter beyond the seven given letters.*\n",
    "\n",
    "*Note that letters can be repeated. For example, the words GAME and AMALGAM are both acceptable words. Four-letter words are worth 1 point each, while five-letter words are worth 5 points, six-letter words are worth 6 points, seven-letter words are worth 7 points, etc. Words that use all of the seven letters in the honeycomb are known as “pangrams” and earn 7 bonus points (in addition to the points for the length of the word). So in the above example, MEGAPLEX is worth 15 points.*\n",
    "\n",
@ -28,27 +30,28 @@
    "\n",
    "# My Approach\n",
    "\n",
-    "Since the referenced [word list](https://norvig.com/ngrams/enable1.txt) came from *my* web site (I didn't make up the list; it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to solve this one. \n",
+    "Since the referenced word list came from **my** web site (I didn't make up the list; it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to solve this one. \n",
    "\n",
-    "This puzzle is  different from other word puzzles because it deals with *unordered sets* of letters, not *ordered permutations* of letters. That makes things easier. When I searched for an optimal 5 by 5 Boggle board, I couldn't exhaustively try all $26^{(5\\times 5)} \\approx 10^{35}$ possibilites; I could only do hillclimbing to find a local maximum. But for Spelling Bee, it is feasible to try every possibility and get a guaranteed highest-scoring honeycomb. Here's a sketch of my approach:\n",
+    "Other word puzzles are hard because there are so many possibilities to consider. \n",
-    " \n",
+    "But fortunately the honeycomb puzzle (unlike [Boggle](https://github.com/aimacode/aima-python/blob/master/search.py) or [Scrabble](Scrabble.ipynb)) deals with *unordered sets* of letters, not *ordered permutations* of letters. So, once we exclude the \"S\", there are only (25 choose 7) = 480,700 *sets* of seven letters to consider.  A brute force approach could evaluate all of them (probably over the course of multiple hours). \n",
    "- Since order and repetition don't count, we can represent a word as a **set** of letters, which I will call a `letterset`. For simplicity I'll choose to implement that as a sorted string (not as a Python `set` or `frozenset`). For example:\n",
    "      letterset(\"GLAM\") == letterset(\"AMALGAM\") == \"AGLM\"\n",
    "- A word is a **pangram** if and only if its letterset has exactly 7 letters.\n",
    "- A honeycomb can be represented by a `(letterset, center)` pair, for example `('AEGLMPX', 'G')` for the honeycomb above.\n",
    "- Since the rules say every valid honeycomb must contain a pangram, it must be the case that every valid honeycomb *is* a pangram. That means:\n",
    "  * The number of valid honeycombs is 7 times the number of pangram lettersets (because any of the 7 letters could be the center).\n",
    "  * I will consider every valid honeycomb and compute the game score for each one.\n",
    "  * The one with the highest game score is guaranteed to be the optimal honeycomb.\n"
   ]
  },
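Review note: the count of seven-letter sets quoted in the new markdown can be checked with a few lines of standard-library Python. This is a minimal sketch; the variable names are illustrative and do not appear in the notebook, and `factorial` is used instead of `math.comb` only because the notebook metadata reports Python 3.7.

```python
from math import factorial

# Excluding 'S' leaves 25 letters; a honeycomb uses 7 distinct letters.
letter_sets = factorial(25) // (factorial(7) * factorial(18))

# Any of the 7 letters can be the center, so a brute-force search
# would have to score 7 honeycombs per letter set.
brute_force_honeycombs = letter_sets * 7

print(f"{letter_sets:,} letter sets, {brute_force_honeycombs:,} honeycombs")
```

The second number is why the pangram observation matters: restricting candidates to pangram lettersets shrinks the search space considerably.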
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Words, Word Scores, Pangrams, and Lettersets\n",
    "\n",
-    "I'll start by importing some utilities and defining four basic functions about words:"
+    "Fortunately, I noticed a better trick. The rules say that every valid honeycomb must contain a pangram. Therefore, it must be the case that every valid honeycomb **is** a pangram. How many pangrams could there be in the word list&mdash;maybe 10,000?  It must be a lot less than the number of sets of 7 letters.\n",
    "\n",
    "So here's a broad sketch of my approach:\n",
    "\n",
    "- The **best honeycomb** is the one with the highest game score among all candidate honeycombs.\n",
    "- A **candidate honeycomb** is any set of 7 letters that constitute a pangram word in the word list, with any one of the 7 letters as the center.\n",
    "- A **pangram word** is a word with exactly 7 distinct letters.\n",
    "- The **game score** for a honeycomb is the sum of the word scores for all the words that the honeycomb can make.\n",
    "- The **word score** of a word is 1 for four-letter words, or else $n$ for $n$-letter words, plus a 7-point bonus for pangrams.\n",
    "- A honeycomb **can make** a word if all the letters in the word are in the honeycomb, and the word contains the center letter.\n",
    "- The **set of letters** in a word (or honeycomb) can be represented as a sorted string of distinct letters (e.g., the set of letters in \"AMALGAM\" is \"AGLM\"). \n",
    "- A **honeycomb** is defined by two things, the set of seven letters, and the distinguished single center letter.\n",
    "- The **word list** can ignore words that: are less than 4 letters long; have an S; or have more than 7 distinct letters.\n",
    "\n",
    "(Note: I could have used a `frozenset` to represent a set of letters, but a sorted string seemed simpler, and for debugging purposes, I'd rather be looking at  `'AEGLMPX'` than at `frozenset({'A', 'E', 'G', 'L', 'M', 'P', 'X'})`).\n",
    "\n",
    "Each of these concepts can be implemented in a couple lines of code:"
   ]
  },
  {
@ -57,45 +60,58 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from collections import Counter, defaultdict\n",
+    "def best_honeycomb(words) -> tuple: \n",
-    "from itertools import combinations"
+    "    \"\"\"Return (score, honeycomb) for the honeycomb with highest game score on these words.\"\"\"\n",
-   ]
+    "    return max((game_score(h, words), h) for h in candidate_honeycombs(words))\n",
-  },
+    "\n",
-  {
+    "def candidate_honeycombs(words):\n",
-   "cell_type": "code",
+    "    \"\"\"The pangram lettersets, each with all 7 centers.\"\"\"\n",
-   "execution_count": 2,
+    "    pangrams = {letterset(w) for w in words if is_pangram(w)}\n",
-   "metadata": {},
+    "    return (Honeycomb(pangram, center) for pangram in pangrams for center in pangram)\n",
-   "outputs": [],
+    "\n",
-   "source": [
+    "def is_pangram(word) -> bool: \n",
-    "def Words(text) -> list:\n",
+    "    \"\"\"Does a word have exactly 7 distinct letters?\"\"\"\n",
-    "    \"\"\"A list of all the valid space-separated words.\"\"\"\n",
+    "    return len(set(word)) == 7\n",
-    "    return [w for w in text.upper().split() \n",
+    "\n",
-    "            if len(w) >= 4 and 'S' not in w and len(set(w)) <= 7]\n",
+    "def game_score(honeycomb, words) -> int:\n",
    "    \"\"\"The total score for this honeycomb; the sum of the word scores.\"\"\"\n",
    "    return sum(word_score(word) for word in words if can_make(honeycomb, word))\n",
    "\n",
    "def word_score(word) -> int: \n",
    "    \"\"\"The points for this word, including bonus for pangram.\"\"\"\n",
    "    bonus = (7 if is_pangram(word) else 0)\n",
    "    return (1 if len(word) == 4 else len(word) + bonus)\n",
    "\n",
-    "def is_pangram(word) -> bool: \n",
+    "def can_make(honeycomb, word) -> bool:\n",
-    "    \"\"\"Does a word use all 7 letters (some maybe more than once)?\"\"\"\n",
+    "    \"\"\"Can the honeycomb make this word?\"\"\"\n",
-    "    return len(set(word)) == 7\n",
+    "    (letters, center) = honeycomb\n",
    "    return center in word and all(L in letters for L in word)\n",
    "\n",
    "def letterset(word) -> str:\n",
-    "    \"\"\"The set of letters in a word, represented as a sorted str.\"\"\"\n",
+    "    \"\"\"The set of letters in a word, as a sorted string.\n",
-    "    return ''.join(sorted(set(word)))"
+    "    For example, letterset('GLAM') == letterset('AMALGAM') == 'AGLM'.\"\"\"\n",
    "    return ''.join(sorted(set(word)))\n",
    "\n",
    "def Honeycomb(letters, center) -> tuple: return (letters, center)\n",
    "\n",
    "def wordlist(text) -> list:\n",
    "    \"\"\"A list of all the valid whitespace-separated words in text.\"\"\"\n",
    "    return [w for w in text.upper().split() \n",
    "            if len(w) >= 4 and 'S' not in w and len(set(w)) <= 7]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "I'll make a tiny word list to experiment with: "
+    "# Experimentation and Small Test\n",
    "\n",
    "I'll make a tiny word list and start experimenting with it:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
@ -104,13 +120,13 @@
       "['AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']"
      ]
     },
-     "execution_count": 3,
+     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "words = Words('amalgam amalgamation game games gem glam megaplex cacciatore erotica I me')\n",
+    "words = wordlist('amalgam amalgamation game games gem glam megaplex cacciatore erotica I me')\n",
    "words"
   ]
  },
@ -123,7 +139,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
@ -137,7 +153,7 @@
       " 'EROTICA': 14}"
      ]
     },
-     "execution_count": 4,
+     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -148,7 +164,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
@ -157,7 +173,7 @@
       "{'CACCIATORE', 'EROTICA', 'MEGAPLEX'}"
      ]
     },
-     "execution_count": 5,
+     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -168,7 +184,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
@ -182,7 +198,7 @@
       " 'EROTICA': 'ACEIORT'}"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -199,29 +215,32 @@
   ]
  },
  {
-   "cell_type": "markdown",
+   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
-    "# Game Score and Best Honeycomb\n",
+    "honeycomb = Honeycomb('AEGLMPX', 'G')"
    "\n",
    "The game score for a honeycomb is the sum of the word scores for all the words that the honeycomb can make. How do we know if a honeycomb can make a word? It can if (1) the word contains the honeycomb's center and (2) every letter in the word is in the honeycomb. Another way of saying (2) is that the letters in the word must be a **subset** of the letters in the honeycomb.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
-   "outputs": [],
+   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'AMALGAM': 7, 'GAME': 1, 'GLAM': 1, 'MEGAPLEX': 15}"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "def game_score(honeycomb, words) -> int:\n",
+    "{w: word_score(w) for w in words if can_make(honeycomb, w)}"
    "    \"\"\"The total score for this honeycomb.\"\"\"\n",
    "    return sum(word_score(word) for word in words if can_make(honeycomb, word))\n",
    "\n",
    "def can_make(honeycomb, word) -> bool:\n",
    "    \"\"\"Can the honeycomb make this word?\"\"\"\n",
    "    (letters, center) = honeycomb\n",
    "    return center in word and all(L in letters for L in word)"
   ]
  },
  {
@ -241,46 +260,21 @@
    }
   ],
   "source": [
    "honeycomb = ('AEGLMPX', 'G')\n",
    "\n",
    "game_score(honeycomb, words)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I can find the highest-scoring honeycomb by considering all valid honeycombs: ones where the letters are a pangram letterset, and the center can be any of the 7 letters. Then I just need to pick out the honeycomb with the maximum game score:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def best_honeycomb(words) -> list: \n",
    "    \"\"\"Return [score, honeycomb] for the honeycomb with highest score on these words.\"\"\"\n",
    "    return max([game_score(h, words), h] for h in valid_honeycombs(words))\n",
    "\n",
    "def valid_honeycombs(words):\n",
    "    \"The valid honeycombs are the pangram lettersets, each with all 7 centers.\"\n",
    "    pangrams = {letterset(w) for w in words if is_pangram(w)}\n",
    "    return ((pangram, center) for pangram in pangrams for center in pangram)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "[31, ('ACEIORT', 'T')]"
+       "(31, ('ACEIORT', 'T'))"
      ]
     },
-     "execution_count": 10,
+     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -293,16 +287,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "**We're done!** We know how to find the best honeycomb. But so far, we've only done it  for the tiny word list. \n",
+    "**We're done!** We know how to find the best honeycomb. But so far, we've only done it  for the tiny word list. Let's look at the real word list.\n",
    "\n",
-    "# The enable1 Word List\n",
+    "# The enable1 Word List\n"
    "\n",
    "Here's the real word list:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
@ -320,7 +312,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
@ -329,19 +321,19 @@
       "44585"
      ]
     },
-     "execution_count": 12,
+     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "enable1 = Words(open('enable1.txt').read())\n",
+    "enable1 = wordlist(open('enable1.txt').read())\n",
    "len(enable1)"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
@ -350,7 +342,7 @@
       "14741"
      ]
     },
-     "execution_count": 13,
+     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -361,25 +353,72 @@
   ]
  },
  {
-   "cell_type": "markdown",
+   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7986"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "So: we start with 172,820 words in the enable1 word list, reduce that to 44,585 valid Spelling Bee words, and find that 14,741 of those words are pangrams. \n",
+    "pangram_sets = {letterset(w) for w in pangrams}\n",
-    "\n",
+    "len(pangram_sets)"
    "How long will it take to run `best_honeycomb(enable1)`? Let's estimate by checking how long it takes to compute the game score of a single honeycomb:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "55902"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "_ * 7"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So to recap on the number of words of various types in enable1:\n",
    "\n",
    "    172,820 total words\n",
    "     44,585 valid words (eliminating \"S\" words, short words, words with 8+ distinct letters)\n",
    "     14,741 pangram words\n",
    "      7,986 unique pangram lettersets\n",
    "     55,902 candidate honeycombs\n",
    "\n",
    "How long will it take to run `best_honeycomb(enable1)`? Let's estimate by checking how long it takes to compute the game score of a single honeycomb:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "CPU times: user 11.9 ms, sys: 391 µs, total: 12.3 ms\n",
+      "CPU times: user 10.5 ms, sys: 286 µs, total: 10.8 ms\n",
-      "Wall time: 12 ms\n"
+      "Wall time: 10.8 ms\n"
     ]
    },
    {
@ -388,7 +427,7 @@
       "153"
      ]
     },
-     "execution_count": 14,
+     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -401,52 +440,24 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "About 12 milliseconds. How many minutes would that be for all 14,741 × 7 valid honeycombs?"
+    "That's to compute one `game_score`. Multiply by 55,902 candidate honeycombs and we get somewhere in the 10 minute range. I could run `best_honeycomb(enable1)` right now and take a coffee break until it completes, but I'm predisposed to think that a puzzle like this deserves a more elegant solution. I know that [Project Euler](https://projecteuler.net/) designs their puzzles so that a good solution runs in less than a minute, so I'll make that my goal here.\n",
   ]
  },
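Review note: the "10 minute range" estimate follows directly from the numbers shown in this diff. A small sketch of the arithmetic, assuming the 10.8 ms wall time from the `%time` output is representative of every honeycomb (the variable names are illustrative):

```python
seconds_per_score = 10.8e-3      # wall time of one game_score call, from the %time output
pangram_lettersets = 7986        # unique pangram lettersets in enable1
candidates = pangram_lettersets * 7   # each letterset with each of its 7 centers

estimate_minutes = seconds_per_score * candidates / 60
print(f"{candidates:,} candidates, roughly {estimate_minutes:.1f} minutes")
```

The earlier version of the notebook multiplied by 14,741 pangram words rather than 7,986 lettersets, which is why its estimate was about twice as long.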
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "20.6374"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    ".012 * 14741 * 7 / 60"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "About 20 minutes. I could run `best_honeycomb(enable1)` right now and take a coffee break until it completes, but I'm predisposed to think that a puzzle like this deserves a more elegant solution. I'd like to get the run time under a minute (as in [Project Euler](https://projecteuler.net/)).\n",
    "\n",
    "# Making it Faster\n",
    "\n",
-    "Here's my plan for a more efficient program:\n",
+    "Here's how I think about making a more efficient program:\n",
    "\n",
-    "1. Keep the same strategy of trying every pangram, but do some precomputation that will make `game_score` much faster.\n",
+    "- We're doing a `game_score` for each of the 55,902  `candidate_honeycombs`. \n",
-    "1. The precomputation is: compute the `letterset` and `word_score` for each word, and make a table of `{letterset: points}` giving the total number of points that can be made with each letterset. I call this a `points_table`.\n",
+    "- `game_score` has to **look at each word in the wordlist, and test if it is a subset of the honeycomb.**\n",
-    "3. These calculations are independent of the honeycomb, so they need to be done only once, not 14,741 × 7  times. \n",
+    "- We can speed things up by flipping the test around: **look at each letter subset of the honeycomb, and test if it is in the word list.**\n",
-    "4. Within `game_score`, generate every valid **subset** of the letters in the honeycomb. A valid subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ subsets. The function `letter_subsets(honeycomb)` returns these.\n",
+    "- By **letter subset** I mean a letter set containing a subset of the letters in the honeycomb, and definitely containing the center. So, for  `Honeycomb('ACEIORT', 'T')` the letter subsets are `['T', 'AT', 'CT', 'ET', 'IT', 'OT', 'RT', 'ACT', 'AET', ...]`\n",
-    "5. To compute `game_score`, just take the sum of the 64 subset entries in the points table.\n",
+    "- Why will flipping the test be faster? Because there are 44,585 words in the word list and only 64 letter subsets of a honeycomb. (A subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ letter subsets of each pangram.)\n",
    "- We're left with the problem of deciding if a letter subset is a word. In fact, a letter subset might correspond to multiple words (e.g. `'AGLM'` corresponds to both `GLAM` and `AMALGAM`). \n",
    "- Ultimately we're more interested in the total number of points that a letter subset corresponds to, not in the individual word(s).\n",
    "- So I will create a table of `{letter_subset: total_points}` giving the total number of word score points for all the words that correspond to the letter subset. I call this a `points_table`.\n",
    "- Since the points table is independent of any honeycomb, I can compute it once and for all; I don't need to recompute it for each honeycomb.\n",
    "- To compute `game_score`, just take the sum of the 64 letter subset entries in the points table.\n",
    "\n",
-    "\n",
+    "Here's the code. Notice I didn't want to redefine the global function `game_score` with a different signature, so instead I made it be a local function that references the local `pts_table`."
    "That means that in `game_score` we no longer need to iterate over 44,585 words and check if each word is a subset of the honeycomb. Instead we iterate over the 64 subsets of the honeycomb and check if each one is a word (or more than one word) and how many total points those word(s) score. \n",
    "\n",
    "Since 64 &lt; 44,585, that's a nice optimization!\n",
    "\n",
    "\n",
    "Here's the code. Notice we've changed the interface to `game_score`; it now takes a points table, not a word list. So beware if you are jumping around in this notebook and re-executing previous cells."
   ]
  },
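Review note: the body of `points_table` is elided by the hunk boundary below, so here is a self-contained sketch of the idea described in the bullets above. The names `points_table_sketch` and `fast_game_score` are hypothetical stand-ins, and the use of `Counter` is an assumption; the notebook's actual implementation may differ.

```python
from collections import Counter
from itertools import combinations

def letterset(word):
    """The distinct letters of a word, as a sorted string."""
    return ''.join(sorted(set(word)))

def word_score(word):
    """1 point for four-letter words, else the length, plus 7 for pangrams."""
    bonus = 7 if len(set(word)) == 7 else 0
    return 1 if len(word) == 4 else len(word) + bonus

def points_table_sketch(words):
    """Hypothetical stand-in: {letterset: total points of all words with that letterset}."""
    table = Counter()
    for w in words:
        table[letterset(w)] += word_score(w)
    return table

def letter_subsets(honeycomb):
    """The 64 subsets of a 7-letter honeycomb that contain the center letter."""
    letters, center = honeycomb
    return [''.join(s)
            for n in range(1, 8)
            for s in combinations(letters, n)
            if center in s]

def fast_game_score(honeycomb, table):
    """Sum the table entries for the honeycomb's letter subsets (missing keys count as 0)."""
    return sum(table[s] for s in letter_subsets(honeycomb))

words = ['AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX', 'CACCIATORE', 'EROTICA']
table = points_table_sketch(words)
score = fast_game_score(('AEGLMPX', 'G'), table)
print(score)
```

Because the honeycomb letters are already sorted, `combinations` yields each subset in sorted order, so the joined strings match the `letterset` keys without re-sorting.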
  {
@ -455,13 +466,15 @@
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter, defaultdict\n",
    "from itertools import combinations\n",
    "\n",
    "def best_honeycomb(words) -> tuple: \n",
    "    \"\"\"Return (score, honeycomb) for the honeycomb with highest score on these words.\"\"\"\n",
    "    pts_table = points_table(words)\n",
-    "    pangrams = [s for s in pts_table if len(s) == 7]\n",
+    "    def game_score(honeycomb) -> int: \n",
-    "    honeycombs = ((pangram, center) for pangram in pangrams for center in pangram)\n",
+    "        return sum(pts_table[s] for s in letter_subsets(honeycomb))\n",
-    "    return max([game_score(h, pts_table), h]\n",
+    "    return max((game_score(h), h) for h in candidate_honeycombs(words))\n",
    "               for h in honeycombs)\n",
    "\n",
    "def points_table(words) -> dict:\n",
    "    \"\"\"Return a dict of {letterset: points} from words.\"\"\"\n",
@ -471,16 +484,12 @@
    "    return table\n",
    "\n",
    "def letter_subsets(honeycomb) -> list:\n",
-    "    \"\"\"The 64 subsets of the letters in the honeycomb that contain the center letter.\"\"\"\n",
+    "    \"\"\"The 64 subsets of the letters in the honeycomb (that must contain the center letter).\"\"\"\n",
    "    (letters, center) = honeycomb\n",
    "    return [''.join(subset) \n",
    "            for n in range(1, 8) \n",
    "            for subset in combinations(letters, n)\n",
-    "            if center in subset]\n",
+    "            if center in subset]"
    "\n",
    "def game_score(honeycomb, pts_table) -> int:\n",
    "    \"\"\"The total score for this honeycomb, given a points_table.\"\"\"\n",
    "    return sum(pts_table[s] for s in letter_subsets(honeycomb))"
   ]
  },
  {
@ -507,7 +516,7 @@
    }
   ],
   "source": [
-    "# A 4-letter honeycomb makes 2**3 = 8 subsets; 7-letter honeycombs make 64\n",
+    "# A 4-letter honeycomb makes 2**3 = 8 subsets; 7-letter honeycombs make 2**6 == 64\n",
    "letter_subsets(('ABCD', 'C')) "
   ]
  },
@ -562,9 +571,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "The letterset `'ACEIORT'` gets 31 points, 17 for CACCIATORE and 14 for EROTICA, and the letterset `'AGLM'` gets 8 points, 7 for AMALGAM and 1 for GLAM. The other lettersets represent one word each. \n",
+    "The letterset `'ACEIORT'` gets 31 points, 17 for CACCIATORE and 14 for EROTICA, and the letterset `'AGLM'` gets 8 points, 7 for AMALGAM and 1 for GLAM. The other lettersets represent one word each. "
-    "\n",
+   ]
-    "Let's make sure we haven't broken the `game_score` and `best_honeycomb` functions:"
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test that `best_honeycomb(words)` gets the same answer as before, and that the points table has the same set of pangrams as before."
   ]
  },
  {
@ -573,8 +587,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "assert game_score(honeycomb, points_table(words)) == 24\n",
+    "assert best_honeycomb(words) == (31, ('ACEIORT', 'T'))\n",
-    "assert best_honeycomb(words) == [31, ('ACEIORT', 'T')]"
+    "assert pangram_sets == {s for s in points_table(enable1) if len(s) == 7}"
   ]
  },
  {
@ -600,14 +614,14 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "CPU times: user 1.99 s, sys: 4.75 ms, total: 2 s\n",
+      "CPU times: user 1.84 s, sys: 4.03 ms, total: 1.84 s\n",
-      "Wall time: 2 s\n"
+      "Wall time: 1.85 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
-       "[3898, ('AEGINRT', 'R')]"
+       "(3898, ('AEGINRT', 'R'))"
      ]
     },
     "execution_count": 21,
@ -627,6 +641,77 @@
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Making it Even Fasterer\n",
    "\n",
    "OK, that was 30 times faster than my goal of one minute. It was a nice optimization to look at only 64 letter subsets rather than 44,585 words. But I'm still looking at 55,902 honeycombs, and I feel that some of them are a waste of time.  Consider the pangram \"JUKEBOX\". With the uncommon letters J, K, and X, it does not look like a high-scoring honeycomb, no matter what center we choose. So why waste time trying all seven centers? Here's the outline of a faster `best_honeycomb`:\n",
    "\n",
    "- Go through the pangrams as before\n",
    "- However, always keep track of the best score and the best honeycomb that we have found so far.\n",
    "- For each new pangram, first see how many points it would score if we ignore the restriction that a particular center letter must be used. (I compute that with `game_score('')`, where again `game_score` is a local function,\n",
    "this time with access to both `pts_table` and `subsets`.)\n",
    "- Only if `game_score('')` is better than the best score found so far, then evaluate `game_score(C)` for each of the seven possible centers `C`.\n",
    "- In the end, return the best score and the best honeycomb."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 439 ms, sys: 1.93 ms, total: 441 ms\n",
      "Wall time: 441 ms\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(3898, ('AEGINRT', 'R'))"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def best_honeycomb(words) -> tuple: \n",
    "    \"\"\"Return (score, honeycomb) for the honeycomb with highest score on these words.\"\"\"\n",
    "    best_score, best_honeycomb = 0, None\n",
    "    pts_table = points_table(words)\n",
    "    pangrams = (s for s in pts_table if len(s) == 7)\n",
    "    for pangram in pangrams:\n",
    "        subsets = string_subsets(pangram)\n",
    "        def game_score(center): return sum(pts_table[s] for s in subsets if center in s)\n",
    "        if game_score('') > best_score:\n",
    "            for C in pangram:\n",
    "                if game_score(C) > best_score:\n",
    "                    best_score, best_honeycomb = game_score(C), Honeycomb(pangram, C)\n",
    "    return (best_score, best_honeycomb)\n",
    "\n",
    "def string_subsets(letters) -> list:\n",
    "    \"\"\"All subsets of a string.\"\"\"\n",
    "    return [''.join(s) \n",
    "            for n in range(len(letters) + 1) \n",
    "            for s in combinations(letters, n)]\n",
    "\n",
    "%time best_honeycomb(enable1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking good! We get the same answer, and in about half a second, four times faster than before.  "
   ]
  },
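Review note: the pruning step is safe because every points-table entry is non-negative and `game_score(C)` sums a subset of the terms that `game_score('')` sums (the empty string is contained in every string, so no subset is filtered out). The sketch below illustrates that bound on a toy points table; the table values come from the tiny word list earlier in the notebook, and the variable names `upper_bound` and `centered` are illustrative.

```python
from collections import Counter
from itertools import combinations

def string_subsets(letters):
    """All subsets of a string, as strings (including the empty subset)."""
    return [''.join(s)
            for n in range(len(letters) + 1)
            for s in combinations(letters, n)]

# Toy points table for the words AMALGAM, GLAM, GAME, MEGAPLEX.
table = Counter({'AGLM': 8, 'AEGM': 1, 'AEGLMPX': 15})

pangram = 'AEGLMPX'
subsets = string_subsets(pangram)

def game_score(center):
    """Total points over subsets containing center; '' matches every subset."""
    return sum(table[s] for s in subsets if center in s)

upper_bound = game_score('')                      # no center restriction
centered = {c: game_score(c) for c in pangram}    # one score per possible center

print(upper_bound, centered)
```

If `upper_bound` does not beat the best score found so far, none of the seven centered scores can, so all seven evaluations can be skipped.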
  {
   "cell_type": "markdown",
   "metadata": {},
@ -640,7 +725,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
@ -649,7 +734,7 @@
       "'ANTITOTALITARIAN'"
      ]
     },
-     "execution_count": 22,
+     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -667,7 +752,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
@ -690,7 +775,7 @@
       " 'UTOPIAN']"
      ]
     },
-     "execution_count": 23,
+     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -708,7 +793,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
@ -717,7 +802,7 @@
       "[('S', 103913), ('valid', 44585), ('>7', 23400), ('<4', 922)]"
      ]
     },
-     "execution_count": 24,
+     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -733,12 +818,12 @@
   "source": [
    "There are more than twice as many words with an 'S' as there are valid words.\n",
    "\n",
-    "About the `points_table`: How many different letter subsets are there? Which ones score the most? The least?"
+    "About the `points_table`: How many different letter subsets are there?  "
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 25,
+   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
@ -747,7 +832,7 @@
       "21661"
      ]
     },
-     "execution_count": 25,
+     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -761,12 +846,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "That means there's about two valid words for each letterset."
+    "That means there are about two valid words for each letterset.\n",
    "\n",
    "Which lettersets score the most? The least?"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
@ -794,7 +881,7 @@
       " ('ACELORT', 241)]"
      ]
     },
-     "execution_count": 26,
+     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -805,7 +892,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
@ -833,7 +920,7 @@
       " ('EMYZ', 1)]"
      ]
     },
-     "execution_count": 27,
+     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -853,7 +940,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 28,
+   "execution_count": 29,
   "metadata": {
    "scrolled": false
   },
@ -883,9 +970,9 @@
    "            print(fill(' '.join(words), width=80,\n",
    "                       initial_indent='    ', subsequent_indent='    '))\n",
    "            \n",
-    "def Ns(n, things):\n",
+    "def Ns(n, thing, plural=None):\n",
    "    \"\"\"Ns(3, 'bear') => '3 bears'; Ns(1, 'world') => '1 world'\"\"\"  \n",
-    "    return f\"{n:,d} {things}{'' if n == 1 else 's'}\"\n",
+    "    return f\"{n:,d} {thing if n == 1 else (plural or thing + 's')}\"\n",
    "\n",
    "def group_by(items, key):\n",
    "    \"Group items into bins of a dict, each bin keyed by key(item).\"\n",
@ -897,7 +984,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 29,
+   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
@ -923,7 +1010,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 30,
+   "execution_count": 31,
   "metadata": {
    "scrolled": false
   },
@ -1108,7 +1195,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 31,
+   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1120,7 +1207,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 32,
+   "execution_count": 33,
   "metadata": {
    "scrolled": false
   },
@ -1437,7 +1524,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.2"
+   "version": "3.7.0"
  }
 },
 "nbformat": 4,