{ "cells": [ { "cell_type": "markdown", "id": "9e54c615-5dab-4642-924f-f2840558ff9e", "metadata": {}, "source": [ "
Peter Norvig
April 2026
\n", "\n", "# Did you solve it? R y clvr ngh t rd ths sntnc?\n", "\n", "Alex Bellos's [30 March 2026 column](https://www.theguardian.com/science/2026/mar/30/did-you-solve-it-r-y-clvr-ngh-t-rd-ths-sntnc) asks us to guess famous phrases or sayings, given the shapes of the rectangles that bound the letters, and the clue that vowels are **green** and consonants are **blue**. \n", "\n", "Here's one of the puzzles:\n", "\n", "![](https://i.guim.co.uk/img/media/dd62fe8dfdc6eb9d98815a2e14791ae268aa4d46/0_0_580_75/master/580.jpg?width=310&dpr=2&s=none&crop=none)" ] }, { "cell_type": "markdown", "id": "6a2c1014-5894-4bf0-b4b3-5f1d1561fa2a", "metadata": {}, "source": [ "I can help solve this problem by using code to constrain what each letter and each word might be.\n", "\n", "I'll start by defining different subsets of letters: `v` for the vowels, `c` for the consonants, `a` for the letters whose shape ascends above the norm, etc.:" ] }, { "cell_type": "code", "execution_count": 24, "id": "628d38c1-0671-49dd-9897-7bd5eff196ee", "metadata": {}, "outputs": [], "source": [ "letters = set('abcdefghijklmnopqrstuvwxyz')\n", "v = set('aeiou') # vowels (green)\n", "c = letters - v # consonants (blue)\n", "\n", "a = set('bdfhijlt') # ascending\n", "d = set('gjpqy') # descending\n", "b = letters - a - d # block: neither ascending nor descending\n", "\n", "t = set('li') # thin\n", "w = set('mw') # wide\n", "n = letters - t - w # normal: neither wide nor thin" ] }, { "cell_type": "markdown", "id": "f51d68da-7569-4bdd-97a2-3c245e143c70", "metadata": {}, "source": [ "Now I can say that the first letter of the first word above (the green rectangle) is a block-shaped vowel; the intersection of the **b**lock and **v**owel sets, denoted in Python with the `&` operator:" ] }, { "cell_type": "code", "execution_count": 2, "id": "1f17f9ce-0e88-464d-80e6-53a458b3594b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a', 'e', 'o', 'u'}" ] }, "execution_count": 2, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "b&v" ] }, { "cell_type": "markdown", "id": "de4886ea-0244-4582-8a43-9fae9daceb9a", "metadata": {}, "source": [ "The first word has three letters, `b&v` followed by two **a**scending **t**hin **c**onsonants, `a&t&c`:" ] }, { "cell_type": "code", "execution_count": 3, "id": "df3a9acf-e4df-42eb-8eb0-61fbdda7b598", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'a', 'e', 'o', 'u'}, {'l'}, {'l'}]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[b&v, a&t&c, a&t&c]" ] }, { "cell_type": "markdown", "id": "b3945fa3-587c-4155-b201-ed7aa55510e5", "metadata": {}, "source": [ "Neat! There is only one **a**scending **t**hin **c**onsonant, `'l'`.\n", "\n", "The whole puzzle is as follows (I made the apostrophe be a word of its own):" ] }, { "cell_type": "code", "execution_count": 4, "id": "b4b5bb61-08f6-4815-9f46-ff1dd8e5816c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[{'a', 'e', 'o', 'u'}, {'l'}, {'l'}],\n", " [{'b', 'd', 'f', 'h', 'j', 'l', 't'},\n", " {'b', 'd', 'f', 'h', 'j', 'l', 't'},\n", " {'a', 'e', 'o', 'u'}],\n", " [{'m', 'w'},\n", " {'a', 'e', 'o', 'u'},\n", " {'c', 'k', 'm', 'n', 'r', 's', 'v', 'w', 'x', 'z'},\n", " {'b', 'd', 'f', 'h', 'j', 'l', 't'},\n", " {'b', 'd', 'f', 'h', 'j', 'l', 't'}],\n", " [{'’'}],\n", " [{'c', 'k', 'm', 'n', 'r', 's', 'v', 'w', 'x', 'z'}],\n", " [{'a', 'e', 'o', 'u'}],\n", " [{'c', 'k', 'm', 'n', 'r', 's', 'v', 'w', 'x', 'z'},\n", " {'b', 'd', 'f', 'h', 'j', 'l', 't'},\n", " {'a', 'e', 'o', 'u'},\n", " {'g', 'j', 'p', 'q', 'y'},\n", " {'a', 'e', 'o', 'u'}]]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "puzzle3 = [[b&v, a&t&c, a&t&c], [a&c, a&c, b&v], [w&c, b&v, b&c, a&c, a&c], [{\"’\"}], [b&c], [b&v], [b&c, a&c, b&v, d&c, b&v]]\n", "puzzle3" ] }, { "cell_type": "markdown", "id": "50586a9e-2a63-4c43-858e-509e783aa99f", "metadata": {}, "source": [ "## Possible 
Words\n", "\n", "What combinations of letters can each word pattern make? The function `possible_words` goes through the pattern one letter set at a time, extending every partial word built so far with every letter in the current set:" ] }, { "cell_type": "code", "execution_count": 5, "id": "664292e4-416c-4408-bc9a-a5df0f2ede08", "metadata": {}, "outputs": [], "source": [ "def possible_words(pattern: list[set[str]]) -> set[str]:\n", "    \"\"\"All ways of choosing one letter from each of the possible letters in the word pattern.\"\"\"\n", "    words = {''}  # To start there is one possible partial word, with no letters\n", "    for letter_set in pattern:\n", "        # On each turn, add each possible letter to each possible partial word\n", "        words = {word + letter for word in words for letter in letter_set}\n", "    return words" ] }, { "cell_type": "markdown", "id": "c3de036e-7ca3-45f5-99ca-ead670ced8a0", "metadata": {}, "source": [ "For example," ] }, { "cell_type": "code", "execution_count": 6, "id": "2743ba3a-0474-4f04-91cf-7ea57550b74c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'all', 'ell', 'oll', 'ull'}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "possible_words([b&v, a&t&c, a&t&c])" ] }, { "cell_type": "markdown", "id": "d213da56-3d0b-4344-ab09-6a676b11a1f2", "metadata": {}, "source": [ "Another example:" ] }, { "cell_type": "code", "execution_count": 7, "id": "0a8cb3da-3d3d-4241-9d22-8b50c16aeb8d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'ban', 'bat', 'bon', 'bot', 'can', 'cat', 'con', 'cot'}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "possible_words([{'b', 'c'}, {'a', 'o'}, {'n', 't'}])" ] }, { "cell_type": "markdown", "id": "799abea2-83a4-4d71-bd04-270ff056249a", "metadata": {}, "source": [ "Let's trace through how `possible_words` works on this example. 
It starts with one possible partial word, the empty string:\n", "\n", "    words = {''}\n", "\n", "Then it enters the `for` loop and looks at the first letter set, `{'b', 'c'}`, and adds each letter to each partial word (just one of them: the empty string) to get a set of two partial words:\n", "\n", "    words = {'b', 'c'}\n", "\n", "It does the same thing with the second letter set, `{'a', 'o'}`, to get a set of four partial words:\n", "\n", "    words = {'ba', 'bo', 'ca', 'co'}\n", "\n", "Finally, it considers the third letter set, `{'n', 't'}`, and gets the final answer, a set of eight words:\n", "\n", "    words = {'ban', 'bat', 'bon', 'bot', 'can', 'cat', 'con', 'cot'}\n", "\n", "## Dictionary Words\n", "\n", "What actual dictionary words could a pattern stand for? To answer that I'll need a list of actual dictionary words. Furthermore, when `possible_words(pattern)` returns more than one word, I'll have to pick one. To facilitate that, I will use a word list that includes the frequency of each word so that I can pick the word with the highest frequency.\n", "\n", "I'll download the word list file \"[count_big.txt](count_big.txt)\", which has the format:\n", "\n", "    a 21160\n", "    aah 1\n", "    aaron 5\n", "    ab 2\n", "    aback 3\n", "    abacus 1\n", "    abandon 32\n", "    abandoned 72\n", "    abandoning 27\n", "    abandonment 15\n", "\n", "A few things to note about the following command (you don't need to memorize them):\n", "- The `!` at the start of a line means to run an operating system (shell) command, not a Python command.\n", "- The `[ -e count_big.txt ] ||` part says to skip downloading the file if it already exists.\n", "- The `curl` command downloads the file, and the `-O` option saves it under its remote name.\n" ] }, { "cell_type": "code", "execution_count": 8, "id": "4b0b245b-5085-492e-b3c3-902f2d8ae723", "metadata": {}, "outputs": [], "source": [ "! 
[ -e count_big.txt ] || curl -O https://norvig.com/ngrams/count_big.txt" ] }, { "cell_type": "markdown", "id": "65006749-5475-434c-a6bb-aab8c81e6e0b", "metadata": {}, "source": [ "I'll read the contents of the file into a Python dictionary, which will have the form `{'a': 21160, 'aah': 1, ...}`. " ] }, { "cell_type": "code", "execution_count": 9, "id": "c863b2e8-4e81-4c13-bd62-2aa7be38ad92", "metadata": {}, "outputs": [], "source": [ "def make_dictionary(lines) -> dict[str, int]:\n", "    \"\"\"Each line is a string holding a word and its frequency count; collect them into a dict.\"\"\"\n", "    counts = {}  # Start with an empty dict\n", "    for line in lines:\n", "        word, count = line.split()\n", "        counts[word] = int(count)\n", "    return counts\n", "\n", "with open('count_big.txt') as file:  # Close the file once it has been read\n", "    dictionary = make_dictionary(file)" ] }, { "cell_type": "markdown", "id": "a5939c3b-b99a-4e8d-9cf5-9f4e8eb1533f", "metadata": {}, "source": [ "Here are some things you can do with the dictionary:" ] }, { "cell_type": "code", "execution_count": 10, "id": "9edff66d-8668-4408-b636-a2aec0614fcb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'aback' in dictionary" ] }, { "cell_type": "code", "execution_count": 11, "id": "cae82016-a5d1-4fac-9e4d-75029e506f67", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'the' in dictionary" ] }, { "cell_type": "code", "execution_count": 12, "id": "8bf5eb96-7502-4568-bba0-71313f481e99", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'xyzzy!@#$' in dictionary" ] }, { "cell_type": "code", "execution_count": 13, "id": "5cbaf8af-a4eb-4390-b0f2-0bb0e32426ca", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "80030" ] }, "execution_count": 13, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "dictionary['the'] # get the frequency count" ] }, { "cell_type": "code", "execution_count": 14, "id": "d1919007-f2e6-484a-a078-c9dd805a17c2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "80030" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dictionary.get('the', 0) # Get the count, with a default of 0 if word is not in dictionary" ] }, { "cell_type": "code", "execution_count": 23, "id": "b137e272-9f34-4ecc-9131-934304a4f5b2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dictionary.get('xyzzy!@#$', 0) # Get the count, with a default of 0 if word is not in dictionary" ] }, { "cell_type": "code", "execution_count": 15, "id": "5e10a770-0945-48c7-ad37-826f2e42a2b1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "29136" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(dictionary) # the number of words in the dictionary" ] }, { "cell_type": "markdown", "id": "3662284a-b83e-4122-9896-831fc8580a87", "metadata": {}, "source": [ "Now I want to take a pattern and figure out the most likely word (which I will guess is the most frequent one):" ] }, { "cell_type": "code", "execution_count": 16, "id": "fea1a704-ab79-4204-bd89-506e66c12042", "metadata": {}, "outputs": [], "source": [ "def most_likely_word(pattern: list[set[str]]) -> str:\n", " \"\"\"Out of all the possible words the pattern can make, pick the most frequent one.\"\"\"\n", " return max(possible_words(pattern), key=frequency)\n", "\n", "def frequency(word) -> int: \n", " \"\"\"The frequency count of the word in the dictionary, or 0 by default.\"\"\"\n", " return dictionary.get(word, 0) " ] }, { "cell_type": "code", "execution_count": 17, "id": "a7cbb472-ce03-4239-b64f-ee1f8cbe02e3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"'all'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "most_likely_word([b&v, a&t&c, a&t&c])" ] }, { "cell_type": "markdown", "id": "d78f5702-9064-4a50-822e-f6ad89a16497", "metadata": {}, "source": [ "So far, so good!\n", "\n", "A puzzle consists of a list of word patterns, and we can generate a best guess at solving the puzzle by finding the `most_likely_word` for each word pattern, and then joining them into a big string." ] }, { "cell_type": "code", "execution_count": 18, "id": "b6f70f1d-5f16-4d80-bb95-fc8b01ab57df", "metadata": {}, "outputs": [], "source": [ "def solve(puzzle: list[list[set[str]]]) -> str:\n", " \"\"\"Given a puzzle (a list of word patterns), return a string formed from the most likely matching words.\"\"\"\n", " return ' '.join(most_likely_word(pattern) for pattern in puzzle)" ] }, { "cell_type": "markdown", "id": "316692e0-244f-4715-bec9-f6e7babcd036", "metadata": {}, "source": [ "## Puzzle #3\n", "\n", "We're ready to see if our program can solve the puzzle:\n", "\n", "![](https://i.guim.co.uk/img/media/dd62fe8dfdc6eb9d98815a2e14791ae268aa4d46/0_0_580_75/master/580.jpg?width=310&dpr=2&s=none&crop=none)\n", "\n" ] }, { "cell_type": "code", "execution_count": 19, "id": "42054f45-a8af-4513-a5d0-f097b2acda02", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'all the world ’ s a stage'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "solve(puzzle3) " ] }, { "cell_type": "markdown", "id": "8e3b7a2a-259c-463f-b542-5c095c559516", "metadata": {}, "source": [ "It worked! 
Let's do another one:\n", "\n", "## Puzzle #1\n", "\n", "![](https://i.guim.co.uk/img/media/a54abfc70c03cf9bc40e178c4e6186915b97ea1e/0_0_580_75/master/580.jpg?width=310&dpr=2&s=none&crop=none)" ] }, { "cell_type": "code", "execution_count": 20, "id": "688d1a5c-2770-4dd1-b8db-5d535949f909", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'all ’ s well that ends well'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "solve([[b&v, a&t&c, a&t&c], [{\"’\"}], [b&c], [w&c, b&v, t&c, t&c], [a&c, a&c, b&v, a&c], [b&v, b&c, a&c, b&c], [w&c, b&v, t&c, t&c]])" ] }, { "cell_type": "markdown", "id": "47db30ff-a96a-47d7-a4a0-34fe29dad4db", "metadata": {}, "source": [ "That's correct!\n", "\n", "## Puzzle #8\n", "\n", "![](https://i.guim.co.uk/img/media/3eba3f724bda55abf76c8bca57f645cd5a669aef/0_0_580_75/master/580.jpg?width=310&dpr=2&s=none&crop=none)" ] }, { "cell_type": "code", "execution_count": 21, "id": "599ece2f-88a2-4818-83f9-0d0e478986ab", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'all roads lead to some'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "solve([[v, a&c, a&c], [b&n&c, b&n&v, b&n&v, a&c, b&c], [a&t&c, b&v, b&v, a&c], [a&c, b&v], [c, b&v, w&b&c, b&v]])" ] }, { "cell_type": "markdown", "id": "bfd1dc29-5958-4c51-a124-b39d633ea60f", "metadata": {}, "source": [ "OK, not quite right, but a good hint.\n", "\n", "One more:\n", "\n", "## Puzzle #10\n", "\n", "\n", "![](https://i.guim.co.uk/img/media/c358352e9f14f1fb0d0f458f034bb86777efcc83/0_0_464_75/master/464.jpg?width=310&dpr=2&s=none&crop=none)" ] }, { "cell_type": "code", "execution_count": 22, "id": "68d34678-7688-49c4-94b6-5ba207497866", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'have in blind'" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "solve([[a&c, b&v, b&c, b&v], [a&v, b&c], [a&c, a&t&c, a&v, b&c, a&c]])" ] }, { "cell_type": 
"markdown", "id": "87eb56d5-9a6b-41f0-8eec-b4c8f63dbdbd", "metadata": {}, "source": [ "That's not right either, but again it is a good clue to the right answer. (One reason I didn't get this one right is that I didn't consider capital letters, and a capital \"L\" has a different shape than a lowercase \"l\".)" ] }, { "cell_type": "code", "execution_count": null, "id": "747c3d08-c98a-497d-9430-f6a69dbd70cc", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:base] *", "language": "python", "name": "conda-base-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.9" } }, "nbformat": 4, "nbformat_minor": 5 }