Update Portmantout.ipynb

This commit is contained in:
Peter Norvig 2021-06-05 23:01:50 -07:00 committed by GitHub
parent a50b98856e
commit 60168bce8c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -37,7 +37,7 @@
"\n", "\n",
"There's actually a third type of word, but it doesn't need a corresponding type of step: a **subword** is a word that is a substring of another word. If, say, `ajar` is in $W$, then we know we have to place it in some step along the path. But if `jar` is also in $W$, we don't need a separate step for it—whenever we place `ajar`, we have automatically placed `jar`. The program can save computation time by initializing the set of **unused words** to be just the **nonsubwords** in $W$. (We are still free to use a subword as a **bridging word** if needed.)\n", "There's actually a third type of word, but it doesn't need a corresponding type of step: a **subword** is a word that is a substring of another word. If, say, `ajar` is in $W$, then we know we have to place it in some step along the path. But if `jar` is also in $W$, we don't need a separate step for it—whenever we place `ajar`, we have automatically placed `jar`. The program can save computation time by initializing the set of **unused words** to be just the **nonsubwords** in $W$. (We are still free to use a subword as a **bridging word** if needed.)\n",
"\n", "\n",
"(*English trivia:* I use the clumsy term \"nonsubwords\" rather than \"superwords\", because there are a small number of words, like \"cozy\" and \"pugs\" and \"july,\" that are not subwords and also not superwords.)\n", "(*English trivia:* I use the clumsy term \"nonsubwords\" rather than \"superwords\", because there are a small number of words, like \"cozy\" and \"july,\" that are not subwords and also not superwords.)\n",
"\n", "\n",
"If we're going to be **greedy**, we need to know what we're being greedy about. I will always choose a step that minimizes the **number of excess letters** that the step adds. To be precise: I arbitrarily assume a baseline model in which all the words are concatenated with no repeated words and no overlap between them. (That's not a valid solution, but it is useful as a benchmark.) So if I add an unused word, and overlap it with the previous word by three letters, that is an excess of -3 (a negative excess is a positive thing): I've saved three letters over just concatenating the unused word. For a bridging word, the excess is the number of letters that do not overlap either the previous word or the next word. \n", "If we're going to be **greedy**, we need to know what we're being greedy about. I will always choose a step that minimizes the **number of excess letters** that the step adds. To be precise: I arbitrarily assume a baseline model in which all the words are concatenated with no repeated words and no overlap between them. (That's not a valid solution, but it is useful as a benchmark.) So if I add an unused word, and overlap it with the previous word by three letters, that is an excess of -3 (a negative excess is a positive thing): I've saved three letters over just concatenating the unused word. For a bridging word, the excess is the number of letters that do not overlap either the previous word or the next word. \n",
"\n", "\n",