diff --git a/ipynb/jotto.ipynb b/ipynb/jotto.ipynb
index dbf570c..b95d479 100644
--- a/ipynb/jotto.ipynb
+++ b/ipynb/jotto.ipynb
@@ -306,8 +306,8 @@
"metadata": {},
"outputs": [],
"source": [
- "def evaluate(scores: Iterable[Score]) -> None:\n",
- " \"\"\"Display statistics and a histogram for these scores.\"\"\"\n",
+ "def report(scores: Iterable[Score]) -> None:\n",
+ " \"\"\"Report statistics and a histogram for these scores.\"\"\"\n",
" scores = list(scores)\n",
" ctr = Counter(scores)\n",
" bins = range(min(ctr), max(ctr) + 2)\n",
@@ -359,7 +359,7 @@
}
],
"source": [
- "evaluate(play(random_guesser, target, verbose=False) for target in wordlist)"
+ "report(play(random_guesser, target, verbose=False) for target in wordlist)"
]
},
{
@@ -1112,7 +1112,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(max, wordlist, inconsistent=False)))"
+ "%time report(tree_scores(minimizing_tree(max, wordlist, inconsistent=False)))"
]
},
{
@@ -1144,7 +1144,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(expectation, wordlist, inconsistent=False)))"
+ "%time report(tree_scores(minimizing_tree(expectation, wordlist, inconsistent=False)))"
]
},
{
@@ -1176,7 +1176,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(neg_entropy, wordlist, inconsistent=False)))"
+ "%time report(tree_scores(minimizing_tree(neg_entropy, wordlist, inconsistent=False)))"
]
},
{
@@ -1215,7 +1215,7 @@
}
],
"source": [
- "%time evaluate(play(random_guesser, target, verbose=False) for target in wordlist)"
+ "%time report(play(random_guesser, target, verbose=False) for target in wordlist)"
]
},
{
@@ -1256,7 +1256,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(max, wordlist, inconsistent=True)))"
+ "%time report(tree_scores(minimizing_tree(max, wordlist, inconsistent=True)))"
]
},
{
@@ -1288,7 +1288,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(expectation, wordlist, inconsistent=True)))"
+ "%time report(tree_scores(minimizing_tree(expectation, wordlist, inconsistent=True)))"
]
},
{
@@ -1320,7 +1320,7 @@
}
],
"source": [
- "%time evaluate(tree_scores(minimizing_tree(neg_entropy, wordlist, inconsistent=True)))"
+ "%time report(tree_scores(minimizing_tree(neg_entropy, wordlist, inconsistent=True)))"
]
},
{
@@ -1333,8 +1333,8 @@
"\n",
"|<br>Algorithm|Consistent<br>Only<br>Mean (Max)|Inconsistent<br>Allowed<br>Mean (Max)|\n",
"|--|--|--|\n",
- "|baseline random guesser|7.33 (16)| |\n",
- "|minimize max_counts|7.15 (18)|7.05 (10)|\n",
+ "|random guesser|7.33 (16)| |\n",
+ "|minimize max|7.15 (18)|7.05 (10)|\n",
"|minimize expectation|7.14 (17)|6.84 (10)|\n",
"|minimize neg_entropy|7.09 (19)|6.82 (10)|\n",
"\n",
@@ -1354,7 +1354,7 @@
" - *Yellow* if the guess letter is in the word but in the wrong spot.\n",
" - *Miss* if the letter is not in the word in any spot.\n",
" \n",
- "Since repeated letters and anagrams are allowed, I can use all of `sgb_words` as my list of allowable Wordle words. (Presumably Wordle uses a different list, but this should be not too far off.)\n",
+ "Since repeated letters and anagrams are allowed, I can use all of `sgb_words` as my list of allowable Wordle words.\n",
"\n",
"There seems to be an ambiguity in the rules. Assume the guess is *etude* and the target is *poems*. I think the correct reply should be that one letter *e* is *yellow* and the other is a *miss*, although a strict reading of the rules would say they both should be *yellow*, because both instances of *e* are \"in the word but in the wrong spot.\" I decided that in cases like this I would report the first one as yellow and the second as a miss."
]
@@ -1600,7 +1600,7 @@
}
],
"source": [
- "%time evaluate(play(random_guesser, target, sgb_words, verbose=False) for target in sgb_words)"
+ "%time report(play(random_guesser, target, sgb_words, verbose=False) for target in sgb_words)"
]
},
{
@@ -1633,7 +1633,7 @@
],
"source": [
"%time wtree = minimizing_tree(neg_entropy, sgb_words, sgb_words, inconsistent=False)\n",
- "evaluate(tree_scores(wtree))"
+ "report(tree_scores(wtree))"
]
},
{
@@ -1666,26 +1666,35 @@
],
"source": [
"%time wtree = minimizing_tree(neg_entropy, sgb_words, sgb_words, inconsistent=True)\n",
- "evaluate(tree_scores(wtree))"
+ "report(tree_scores(wtree))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "Pretty good! The Wordle web site challenges you to solve each puzzle in six guesses; we can now do that 99.9% of the time when inconsistent guesses are allowed (a big jump from the 95% without inconsistent guesses and the 92% with random consistent guesses).\n",
+ "Pretty good! The Wordle web site challenges you to solve each puzzle in six guesses; we can now do that 99.9% of the time when inconsistent guesses are allowed, a big jump from the 95% without inconsistent guesses and the 92% with random consistent guesses.\n",
"\n",
+ "This is all on the `sgb-words.txt` file. I poked around in the Wordle JavaScript, and I think I found the word list that they use. If I interpreted it correctly, my algorithm gets these results on it:\n",
+ "\n",
+ " median: 3 guesses, mean: 3.49 ± 0.60, worst: 6, scores: 2,315\n",
+ " cumulative: ≤3:53%, ≤4:96%, ≤5:99.8%, ≤6:100%, ≤7:100%, ≤8:100%, ≤9:100%, ≤10:100%\n",
+ " \n",
+ "I won't post the word list here, because I don't have the author's permission.\n",
"\n",
"# Jotto and Wordle Evaluation Summary\n",
"\n",
- "Here is a summary of the evaluations for both games:\n",
+ "Here is a summary (the first four columns on `sgb-words.txt`, the last on the Wordle word list):\n",
+ "\n",
+ "|<br>Algorithm|JOTTO<br>Consistent<br>Only<br>Mean (Max)|JOTTO<br>Inconsistent<br>Allowed<br>Mean (Max)|WORDLE<br>Consistent<br>Only<br>Mean (Max)|WORDLE<br>Inconsistent<br>Allowed<br>Mean (Max)|WORDLE<br>Official<br>Wordlist<br>Mean (Max)|\n",
+ "|--|--|--|--|--|--|\n",
+ "|random guesser|7.33 (16)| ––––––– |4.64 (14) | ––––––– | 4.08 (8) |\n",
+ "|minimize max|7.15 (18)|7.05 (10)| ––––––– | ––––––– | ––––––– |\n",
+ "|minimize expectation|7.14 (17)|6.84 (10)| ––––––– | ––––––– | ––––––– |\n",
+ "|minimize neg_entropy|7.09 (19)|6.82 (10)| 4.09 (12) | 3.82 (7) | 3.49 (6) |\n",
+ "\n",
+ "\n",
"\n",
- "|<br>Algorithm|JOTTO<br>Consistent<br>Only<br>Mean (Max)|JOTTO<br>Inconsistent<br>Allowed<br>Mean (Max)|WORDLE<br>Consistent<br>Only<br>Mean (Max)|WORDLE<br>Inconsistent<br>Allowed<br>Mean (Max)|\n",
- "|--|--|--|--|--|\n",
- "|baseline random guesser|7.33 (16)| –––– |4.64 (14) | ––––\n",
- "|minimize max_counts|7.15 (18)|7.05 (10)| –––– | –––– |\n",
- "|minimize expectation|7.14 (17)|6.84 (10)| –––– | –––– |\n",
- "|minimize neg_entropy|7.09 (19)|6.82 (10)| 4.09 (12) | 3.82 (7) |\n",
"\n",
"# Sample Wordle Games with Minimizing Guesser\n",
"\n",
@@ -1856,21 +1865,33 @@
"source": [
"The best words use popular letters, especially \"e\", \"s\", \"a\", \"r\", \"l\", \"t\".\n",
"\n",
- "The worst words have repeated unpopular letters.\n",
+ "The worst words have repeated unpopular letters, like \"zz\" and \"yukky\".\n",
"\n",
"# Next Steps\n",
"\n",
"There are many directions you could take this if you are interested:\n",
- "- Do the refactoring so that the code can neatly handle multiple different games with different replies, etc.\n",
- "- Run the computations to figure out the best strategy for Wordle.\n",
- "- Rerun the computations with the larger Wordle word lists. If necessary, optimize code first.\n",
- "- Consider game variant where each reply consists of two numbers: the number of letters in common with the target, and the number of letters that are in the exact correct position (as in Mastermind).\n",
- "- Implement Mastermind with 6 colors and 4 pegs, and with other combinations.\n",
- "- What's the best strategy for a chooser who is trying to make the guesser get a bad score. Is there a strategy equilibrium?\n",
- "- Our `minimizing_tree` function is **greedy** in that it guesses the word that minimizes some metric of the current situation without looking ahead to future branches in the tree. Can you get better performance by doing some **look-ahead**? Perhaps with a beam search?\n",
- "- As an alternative to look-ahead, can you improve a tree by editing it? Given a tree, look for interior nodes that end up with a worse-than-expected average score, and see if the node can be replaced with something better (covering the same target words). Correcting a few bad nodes might be faster than carefully searching for good nodes in the first place.\n",
- "- Research what other computer scientists have done with [Jotto](https://arxiv.org/abs/1107.3342) or [Mastermind](http://serkangur.freeservers.com/).\n",
- "- What else can you explore?"
+ "- **Other games:**\n",
+ " - Consider a Jotto game variant where each reply consists of two numbers: the number of letters in common with the target, and the number of letters that are in the exact correct position (as in Mastermind).\n",
+ " - Implement [Mastermind](https://en.wikipedia.org/wiki/Mastermind_%28board_game%29). The default version has 6 colors and 4 pegs. Can you go beyond that?\n",
+ " - Research what other computer scientists have done with [Jotto](https://arxiv.org/abs/1107.3342) or [Mastermind](http://serkangur.freeservers.com/).\n",
+ "- **Better strategy**:\n",
+ " - Our `minimizing_tree` function is **greedy** in that it guesses the word that minimizes some metric of the current situation without looking ahead to future branches in the tree. Can you get better performance by doing some **look-ahead**? Perhaps with a beam search?\n",
+ " - As an alternative to look-ahead, can you improve a tree by editing it? Given a tree, look for interior nodes that end up with a worse-than-expected average score, and see if the node can be replaced with something better (covering the same target words). Correcting a few bad nodes might be faster than carefully searching for good nodes in the first place.\n",
+ " - The metrics max, expectation, and negative entropy are all designed as proxies to what we really want to minimize: the average number of guesses. Can we estimate that directly? For example, we know a branch of size 1 has average 1; of size 2 has average 1.5; and of size 3 has average 1.5 if one of the words partitions the other two, otherwise an average of 2. Can we learn a function that takes a set of words as input and estimates the average number of guesses for the set?\n",
+ " - Is it feasible to do a complete search and find the guaranteed optimal strategy? What optimizations to the code would be necessary? How long would the search take?\n",
+ "- **Code refactoring**:\n",
+ " - Refactor the code so it can smoothly handle multiple different games with different replies, etc.\n",
+ "- **Chooser strategy**:\n",
+ "  - Analyze the game where the chooser is not random, but rather is an adversary to the guesser: the chooser tries to choose a word that will maximize the guesser's score. What's a good strategy for the chooser? Is there a strategy equilibrium?\n",
+ "\n",
+ "One idea is for the chooser to pick a word for which one of the spots can be filled by many different letters, such as:\n",
+ "\n",
+ " bills cills dills fills gills hills jills kills lills mills nills pills rills sills tills vills wills yills zills\n",
+ " aight bight dight eight fight hight kight light might night pight right sight tight wight\n",
+ " backs cacks dacks hacks jacks kacks lacks macks packs racks sacks tacks wacks yacks zacks\n",
+ " bangs cangs dangs fangs gangs hangs kangs mangs pangs rangs sangs tangs vangs wangs yangs\n",
+ " bests fests gests hests jests kests lests nests pests rests tests vests wests yests zests\n",
+ " bines cines dines fines kines lines mines nines pines rines sines tines vines wines zines\n"
]
}
],