Add files via upload

Peter Norvig
2018-02-08 00:27:29 -08:00
committed by GitHub
parent a4ac971d84
commit 5e0ecb0766


@@ -8,7 +8,7 @@
"\n",
"# Bad Grade, Good Experience\n",
"\n",
"Recently I was asked a question I hadn't thought about before: \n",
"Recently I was asked a question I hadn't thought about in decades: \n",
"\n",
"> *As a student, did you ever get a bad grade on a programming assignment?* \n",
"\n",
@@ -20,20 +20,20 @@
"\n",
"After studying Snobol a bit, I realized that the expected solution was along these lines:\n",
"\n",
"1. Create an empty `dict` (Snobol calls these \"tables\") whose keys will be words and values will be lists of line numbers.\n",
"1. Create an empty hash table whose keys will be words and values will be lists of line numbers.\n",
"2. Read the lines of text (tracking the line numbers), split them into words, and build up the list of line numbers for each word.\n",
"3. Convert the table into a two-dimensional `array` where each row has the two columns `[word, line_numbers]`.\n",
"3. Convert the hash table into a two-dimensional array where each row has the two columns `[word, line_numbers]`.\n",
"4. Write a function to sort the array alphabetically (`sort` is not built into Snobol).\n",
"5. Write a function to print the array.\n",
"\n",
"That would be around 40 to 60 lines of code; an easy task. But I noticed three interesting things about Snobol:\n",
"\n",
"* There is an *indirection* operator, `$`, so if the variable `'X'` has the value `\"A\"`, then `'$X = i'` is the same as `'A = i'`.\n",
"* Uninitialized variables are treated as the empty string, so `'A += \"text\"'` works even if we haven't seen `'A'` before.\n",
"* When the program ends, the Snobol interpreter automatically\n",
"prints the values of every variable, sorted alphabetically, as a debugging aid.\n",
"* '`$`' is an *indirection* operator, so if the variable `'word'` has the value `\"A\"`, then `'$word = x'` is the same as `'A = x'`.\n",
"* Uninitialized variables are treated as the empty string, so `'A = A + \"text\"'` works even if we haven't seen `'A'` before.\n",
"* When the program ends, the Snobol interpreter \n",
"prints out each variable (in sorted order), with its value, as a debugging aid.\n",
"\n",
"That means I could do away with the `dict` and `array` data structures, eliminating steps 1, 3, 4, and 5, and just do step 2! \n",
"That means I could use `$` to do away with the hash table and array data structures, eliminating steps 1, 3, 4, and 5, and just do step 2! \n",
"\n",
"# The Concordance Solution\n",
"\n",
@@ -50,8 +50,8 @@
"source": [
"program = \"\"\"\n",
"for i, line in enumerate(input):\n",
" for word in re.findall(r\"\\w+\", line.upper()):\n",
" $word += str(i) + ', '\n",
" for word in re.findall(\"[A-Z]+\", line.upper()):\n",
" $word = $word + i + \", \"\n",
"\"\"\""
]
},
@@ -61,38 +61,43 @@
"source": [
"That's just 3 lines, not 40 to 60! \n",
"\n",
"To test the program, I'll write a mock Snobol/Python interpreter, which at heart is just a call to the Python interpreter, `exec(program)`, except that it handles the three things I noticed about the Snobol interpreter:\n",
"To test the program, I'll write a mock Snobol/Python interpreter, which at heart is just a call to the Python interpreter, `exec(program)`, except that it handles the three things I mentioned about the Snobol interpreter, plus one more:\n",
"\n",
"* `$word` gets translated as `_context[word]`.\n",
"* It calls `exec(program, _context)`, where `_context` is a `defaultdict(str)`, so variables default to `''`.\n",
"* After the `exec` completes, the user-defined variables (but not the built-in ones) are printed."
"1. `$word` gets translated as `_globals[word]`.\n",
"2. The interpreter calls `exec(program, _globals)`, where `_globals` is a `defaultdict` that makes variables default to the empty string.\n",
"3. After the `exec` completes, the user-defined variables (but not the built-in ones) are printed.\n",
"4. Concatenating a string with an integer coerces the `int` to `str` automatically. I'll handle that with a `Str` class.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"metadata": {},
"outputs": [],
"source": [
"from collections import defaultdict\n",
"import re\n",
"\n",
"def snobol(program, data=''):\n",
" \"\"\"A Python interpreter with three Snobol-ish features:\n",
" (1) $word indirection; (2) variables default to ''; (3) post-mortem dump.\"\"\"\n",
" program = re.sub(r'\\$(\\w+)', r'_context[\\1]', program) # (1) \n",
" _context = defaultdict(str, vars(__builtins__)) # (2) \n",
" _context.update(re=re, input=data.splitlines(), _context=_context)\n",
" builtins = set(_context)\n",
" \"\"\"A Python interpreter with four Snobol-ish features:\n",
" 1. $word indirection; 2. variables default to empty string; \n",
" 3. post-mortem dump; 4. automatic coercing to string\"\"\"\n",
" program = re.sub(r'\\$(\\w+)', r'_globals[\\1]', program) # 1. \n",
" _globals = defaultdict(Str, vars(__builtins__)) # 4., 2.\n",
" _globals.update(re=re, input=data.splitlines(), _globals=_globals)\n",
" builtins = set(_globals) | {'__builtins__'}\n",
" try:\n",
" exec(program, _context)\n",
" exec(program, _globals)\n",
" finally:\n",
" print('-' * 79) # (3)\n",
" for name in sorted(_context):\n",
" if not (name in builtins or name == '__builtins__'):\n",
" print('{:10} = {}'.format(name, _context[name]))"
" print('-' * 79) # 3. \n",
" for name in sorted(_globals):\n",
" if name not in builtins:\n",
" print('{:10} = {}'.format(name, _globals[name]))\n",
" \n",
"class Str(str):\n",
" \"String class with automatic coercion for +\"\n",
" def __add__(self, other): return Str(str(self) + str(other))\n",
" def __radd__(self, other): return Str(str(other) + str(self))\n"
]
},
{
@@ -105,9 +110,7 @@
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"metadata": {},
"outputs": [],
"source": [
"data = \"\"\"\n",
@@ -115,17 +118,17 @@
"Singin' \"Do wah diddy diddy dum diddy do\"\n",
"Snappin' her fingers and shufflin' her feet, \n",
"Singin' \"Do wah diddy diddy dum diddy do\"\n",
"She looked good (looked good), she looked fine (looked fine)\n",
"She looked good, she looked fine and I nearly lost my mind\n",
"She looked good (looked good), \n",
"She looked fine (looked fine)\n",
"She looked good, she looked fine \n",
"And I nearly lost my mind\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -133,24 +136,24 @@
"text": [
"-------------------------------------------------------------------------------\n",
"A = 1, \n",
"AND = 3, 6, \n",
"AND = 3, 8, \n",
"DIDDY = 2, 2, 2, 4, 4, 4, \n",
"DO = 2, 2, 4, 4, \n",
"DOWN = 1, \n",
"DUM = 2, 4, \n",
"FEET = 3, \n",
"FINE = 5, 5, 6, \n",
"FINE = 6, 6, 7, \n",
"FINGERS = 3, \n",
"GOOD = 5, 5, 6, \n",
"GOOD = 5, 5, 7, \n",
"HER = 3, 3, \n",
"I = 6, \n",
"I = 8, \n",
"JUST = 1, \n",
"LOOKED = 5, 5, 5, 5, 6, 6, \n",
"LOST = 6, \n",
"MIND = 6, \n",
"MY = 6, \n",
"NEARLY = 6, \n",
"SHE = 1, 5, 5, 6, 6, \n",
"LOOKED = 5, 5, 6, 6, 7, 7, \n",
"LOST = 8, \n",
"MIND = 8, \n",
"MY = 8, \n",
"NEARLY = 8, \n",
"SHE = 1, 5, 6, 7, 7, \n",
"SHUFFLIN = 3, \n",
"SINGIN = 2, 4, \n",
"SNAPPIN = 3, \n",
@@ -160,8 +163,8 @@
"WAH = 2, 4, \n",
"WALKIN = 1, \n",
"WAS = 1, \n",
"i = 6\n",
"line = She looked good, she looked fine and I nearly lost my mind\n",
"i = 8\n",
"line = And I nearly lost my mind\n",
"word = MIND\n"
]
}
@@ -174,15 +177,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Oops! The post-mortem printout includes the variables `i`, `line`, and `word`. Reluctantly, I increased the program's line count by 33%:"
"**Oops!** The post-mortem printout includes the variables `i`, `line`, and `word`. Reluctantly, I'll increase the program's line count by 33%:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -190,24 +191,24 @@
"text": [
"-------------------------------------------------------------------------------\n",
"A = 1, \n",
"AND = 3, 6, \n",
"AND = 3, 8, \n",
"DIDDY = 2, 2, 2, 4, 4, 4, \n",
"DO = 2, 2, 4, 4, \n",
"DOWN = 1, \n",
"DUM = 2, 4, \n",
"FEET = 3, \n",
"FINE = 5, 5, 6, \n",
"FINE = 6, 6, 7, \n",
"FINGERS = 3, \n",
"GOOD = 5, 5, 6, \n",
"GOOD = 5, 5, 7, \n",
"HER = 3, 3, \n",
"I = 6, \n",
"I = 8, \n",
"JUST = 1, \n",
"LOOKED = 5, 5, 5, 5, 6, 6, \n",
"LOST = 6, \n",
"MIND = 6, \n",
"MY = 6, \n",
"NEARLY = 6, \n",
"SHE = 1, 5, 5, 6, 6, \n",
"LOOKED = 5, 5, 6, 6, 7, 7, \n",
"LOST = 8, \n",
"MIND = 8, \n",
"MY = 8, \n",
"NEARLY = 8, \n",
"SHE = 1, 5, 6, 7, 7, \n",
"SHUFFLIN = 3, \n",
"SINGIN = 2, 4, \n",
"SNAPPIN = 3, \n",
@@ -223,8 +224,8 @@
"source": [
"program = \"\"\"\n",
"for i, line in enumerate(input):\n",
" for word in re.findall(r\"\\w+\", line.upper()):\n",
" $word += str(i) + ', '\n",
" for word in re.findall(\"[A-Z]+\", line.upper()):\n",
" $word = $word + i + \", \"\n",
"del i, line, word\n",
"\"\"\"\n",
"\n",
@@ -235,7 +236,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks good to me! \n",
"## Looks good to me! \n",
"\n",
"But sadly, the grader for the course did not agree, complaining that my program was not extensible: what if I wanted to cover two or more files in one run? What if I wanted the output to have a slightly different format? I argued that [YAGNI](https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it), and if the requirements\n",
"changed, *then* I would write the necessary 40 or 60 lines, but there's no sense doing that until then. The grader was not impressed with my arguments and I got points taken off. \n",
@@ -268,7 +269,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
"version": "3.5.3"
}
},
"nbformat": 4,
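
The main behavioral change in this commit is the `Str` class, which lets the mock program write `$word = $word + i + ", "` without calling `str(i)`. The sketch below is one way to check that coercion in isolation. It assumes the `Str` definition shown in the diff; the `table` variable and the sample `(line number, word)` pairs are made up for illustration and are not part of the notebook.

```python
from collections import defaultdict


class Str(str):
    """String subclass whose + coerces the other operand to str (as in the diff)."""

    def __add__(self, other):
        return Str(str(self) + str(other))

    def __radd__(self, other):
        return Str(str(other) + str(self))


# Hypothetical stand-in for the interpreter's variable table:
# names that have not been seen yet start out as an empty Str.
table = defaultdict(Str)

# Illustrative (line number, word) pairs, not taken from the notebook's data.
for i, word in [(1, "SHE"), (5, "SHE"), (5, "LOOKED")]:
    # Same shape as the program line: Str + int + str, no explicit str(i).
    table[word] = table[word] + i + ", "

# Post-mortem dump in sorted order, mirroring the format used by snobol().
for name in sorted(table):
    print('{:10} = {}'.format(name, table[name]))
```

Because `__add__` and `__radd__` both return a `Str`, the coercion keeps working across chained concatenations, whichever side the integer appears on.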