diff --git a/ipynb/BASIC.ipynb b/ipynb/BASIC.ipynb
index 3224b03..c1d89bc 100644
--- a/ipynb/BASIC.ipynb
+++ b/ipynb/BASIC.ipynb
@@ -4,19 +4,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
Peter Norvig, Feb 2017
\n",
+ "Peter Norvig, Feb 2017
\n",
"\n",
"# BASIC Interpreter\n",
"\n",
- "[Years ago](http://norvig.com/lispy.html), I showed how to write an Interpreter for a dialect of Lisp. Some readers appreciated it, and some asked about an interpreter for a language that isn't just a bunch of parentheses. In 2014 I saw a [celebration](http://time.com/69316/basic/) of the 50th anniversary of the 1964 [Dartmouth BASIC](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) interpreter, and thought that I could show how to implement such an interpreter. I never quite finished in 2014, but now it is 2017, I rediscovered this unfinished file, and completed it. For those of you unfamiliar with BASIC, here is a sample program:"
+ "[Years ago](http://norvig.com/lispy.html), I showed how to write an Interpreter for a dialect of Lisp. Some readers appreciated it, and some asked about an interpreter for a language that isn't just a bunch of parentheses. In 2014 I saw a [celebration](http://time.com/69316/basic/) of the 50th anniversary of the 1964 [Dartmouth BASIC](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) interpreter, and I thought that I could show how to implement such an interpreter. I never quite finished in 2014, but now it is 2017, I rediscovered this unfinished file, and completed it. For those of you unfamiliar with BASIC, here is a sample program:"
]
},
{
"cell_type": "code",
"execution_count": 1,
- "metadata": {
- "collapsed": true
- },
+ "metadata": {},
"outputs": [],
"source": [
"program = '''\n",
@@ -25,7 +23,7 @@
"15 READ N0, P0\n",
"20 PRINT \"N\",\n",
"25 FOR P = 2 to P0\n",
- "30 PRINT \"N ^\" P,\n",
+ "30 PRINT \"N^\" P,\n",
"35 NEXT P\n",
"40 PRINT \"SUM\"\n",
"45 LET S = 0\n",
@@ -45,53 +43,72 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Of course I don't have to build everything from scratch in assembly language, and I don't have to worry about every byte of storage, like [Kemeny](http://www.dartmouth.edu/basicfifty/basic.html), [Gates](http://www.pagetable.com/?p=774), and [Woz](http://www.retrothing.com/2008/07/restoring-wozs.html) did, so my job is much easier. The interpreter consists of three phases: \n",
+ "Of course I don't have to build everything from scratch in assembly language, and I don't have to worry about every byte of storage, like [Kemeny](http://www.dartmouth.edu/basicfifty/basic.html), [Gates](http://www.pagetable.com/?p=774), [Allison](https://en.wikipedia.org/wiki/Tiny_BASIC) and [Woz](http://www.retrothing.com/2008/07/restoring-wozs.html) did, so my job is much easier. The interpreter consists of three phases: \n",
"* **Tokenization**: breaking a text into a list of tokens, for example: `\"10 READ N\"` becomes `['10', 'READ', 'N']`.\n",
- "* **Parsing**: building a representation from the tokens: `Stmt(num=10, typ='READ', args=['N'])`.\n",
- "* **Execution**: follow the flow of the program and do what each statement says; in this case the `READ` statement\n",
- "has the effect of an assignment: `variables['N'] = data.popleft()`.\n",
+ "* **Parsing**: building an internal representation from the tokens: `Stmt(num=10, typ='READ', args=['N'])`.\n",
+ "* **Execution**: do what each statement says; in this case the `READ` statement\n",
+ "assigns the next `DATA` element to `N`.\n",
"\n",
- "\n",
- "\n",
- "# Tokenization\n",
- "\n",
- "One way to turn a line of text into a list of tokens is with the `findall` method of a regular expression that defines all the legal tokens:"
+ "Before covering the three phases, some imports:"
]
},
{
"cell_type": "code",
"execution_count": 2,
- "metadata": {
- "collapsed": true
- },
+ "metadata": {},
"outputs": [],
"source": [
"import re \n",
- "\n",
- "tokenize = re.compile(r'''\n",
- " \\d* \\.? \\d+ (?: E -? \\d+)? | # number \n",
- " SIN|COS|TAN|ATN|EXP|ABS|LOG|SQR|RND|INT|FN[A-Z]| # functions\n",
- " LET|READ|DATA|PRINT|GOTO|IF|FOR|NEXT|END | # keywords\n",
- " DEF|GOSUB|RETURN|DIM|REM|TO|THEN|STEP|STOP | # keywords\n",
- " [A-Z]\\d? | # variable names (letter + optional digit)\n",
- " \".*?\" | # labels (strings in double quotes)\n",
- " <>|>=|<= | # multi-character relational operators\n",
- " \\S # any non-space single character ''', \n",
- " re.VERBOSE).findall"
+ "from typing import *\n",
+ "from collections import namedtuple, defaultdict, deque"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "In this version of BASIC, variable names must be either a single letter, or a letter followed by a digit. The only complicated part is the syntax for numbers: optional digits followed by an optional decimal point, some digits, and optionally a power of 10 marked by `\"E\"` and followed by an (optional) minus sign and some digits. \n",
- "Example usage of `tokenize`:"
+ "# Tokenization\n",
+ "\n",
+ "One way to break a line of text into tokens is to define a regular expression that matches every legal token, and use the `findall` method to return a list of tokens. I'll use the `re.VERBOSE` option so that I can add whitespace and comments to make the regular expression more readble. You can refer to [regular expression syntax](https://docs.python.org/3/library/re.html#regular-expression-syntax) if needed."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
+ "outputs": [],
+ "source": [
+ "tokenize = re.compile(r'''\n",
+ " \\d* \\.? \\d+ (?: E -? \\d+)? # number \n",
+ " | SIN|COS|TAN|ATN|EXP|ABS|LOG|SQR|RND|INT|FN[A-Z] # functions\n",
+ " | LET|READ|DATA|PRINT|GOTO|IF|FOR|NEXT|END # statement types\n",
+ " | DEF|GOSUB|RETURN|DIM|REM|STOP # statement types\n",
+ " | TO|THEN|STEP # keywords\n",
+ " | [A-Z]\\d? # variable names (letter + optional digit)\n",
+ " | \".*?\" # label strings (in double quotes)\n",
+ " | <> | >= | <= # multi-character relational operators\n",
+ " | \\S # non-space single character (operators and variables) ''',\n",
+ " re.VERBOSE).findall"
+ ]
+ },
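The number alternative is the densest part of the pattern, so here is a small sketch that isolates it. The names `NUMBER`, `find_numbers`, and `find_groups` are made up for this illustration and are not part of the notebook. The second pattern swaps the non-capturing group for a capturing one, to show why `(?: ... )` is used.

```python
import re

# The number alternative on its own, as it appears in the token pattern.
NUMBER = r'\d* \.? \d+ (?: E -? \d+)?'
find_numbers = re.compile(NUMBER, re.VERBOSE).findall

# Same pattern, but with a capturing group for the exponent (for contrast only).
find_groups = re.compile(r'\d* \.? \d+ (E -? \d+)?', re.VERBOSE).findall

text = '12.3E4 .5 1.2E-34 7'
print(find_numbers(text))  # whole tokens: ['12.3E4', '.5', '1.2E-34', '7']
print(find_groups(text))   # only the group: ['E4', '', 'E-34', '']
```

With a capturing group, `findall` returns the contents of the group instead of the whole match, which would be useless as a token list.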
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In this version of BASIC, variable names must be either a single letter, or a letter followed by a digit, and user-defined function names must be \"`FN`\" followed by a single letter. The only complicated part is the syntax for numbers: optional digits followed by an optional decimal point, some digits, and optionally a power of 10 marked by \"`E`\" and followed by an (optional) minus sign and some digits. (Note: The regular expression syntax \"`(?: ... )?`\" means that all of \"`...`\" is optional, but is not the group that the `findall` method captures.)\n",
+ "\n",
+ "Example usage of `tokenize`:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
@@ -99,7 +116,7 @@
"['10', 'READ', 'N']"
]
},
- "execution_count": 3,
+ "execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -110,8 +127,13 @@
},
{
"cell_type": "code",
- "execution_count": 4,
- "metadata": {},
+ "execution_count": 5,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
@@ -119,7 +141,7 @@
"['100', 'PRINT', '\"SIN(X)^2 = \"', ',', 'SIN', '(', 'X', ')', '^', '2']"
]
},
- "execution_count": 4,
+ "execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -134,19 +156,25 @@
"source": [
"That looks good. Note that my tokens are just strings; it will be the parser's job, not the tokenizer's, to recognize that `'2'` is a number and `'X'` is the name of a variable. (In some interpreters, the tokenizer makes distinctions like these.)\n",
"\n",
- "There's one important complication: spaces don't matter in BASIC programs, so the following should all be equivalent:\n",
+ "There's one important complication: the manual says that \"spaces have no significance in BASIC (except in messages to be printed...)\" Thus, the following should all be equivalent:\n",
"\n",
" 10 GOTO 99\n",
- " 10GOTO99\n",
" 10 GO TO 99\n",
+ " 10GOTO99\n",
+ " 1 0 GOT O9 9\n",
" \n",
- "The problem is that `tokenize` gets the last one wrong:"
+ "The problem is that `tokenize` gets some of these wrong:"
]
},
{
"cell_type": "code",
- "execution_count": 5,
- "metadata": {},
+ "execution_count": 6,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
@@ -154,7 +182,7 @@
"['10', 'G', 'O', 'TO', '99']"
]
},
- "execution_count": 5,
+ "execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -163,59 +191,6 @@
"tokenize('10 GO TO 99')"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "My first thought was to remove all white space from the input. That would work for this example, but it would change the token `\"HELLO WORLD\"` to `\"HELLOWORLD\"`, which is wrong. To remove spaces everywhere *except* between double quotes, I can tokenize the line and join the tokens back together. Then I can re-tokenize to get the final list of tokens; I do that in my new function below called `tokenizer`. \n",
- "\n",
- "Once I have a list of tokens, I access them through this interface:\n",
- "* `peek()`: returns the next token in `tokens` (without changing `tokens`), or `None` if there are no more tokens.\n",
- "* `pop()`: removes and returns the next token. \n",
- "* `pop(`*string*`)`: removes and returns the next token if it is equal to the string; else return `None` and leave `tokens` unchanged.\n",
- "* `pop(`*predicate*`)`: remove and return the next token if *predicate*(*token*) is true; else return `None`, leave `tokens` alone."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "tokens = [] # Global variable to hold a list of tokens\n",
- "\n",
- "def tokenizer(line):\n",
- " \"Return a list of the tokens on this line, handling spaces properly, and upper-casing.\"\n",
- " line = ''.join(tokenize(line)) # Remove whitespace\n",
- " return tokenize(line.upper())\n",
- "\n",
- "def peek(): \n",
- " \"Return the first token in the global `tokens`, or None if we are at the end of the line.\"\n",
- " return (tokens[0] if tokens else None)\n",
- "\n",
- "def pop(constraint=None):\n",
- " \"\"\"Remove and return the first token in `tokens`, or return None if token fails constraint.\n",
- " constraint can be None, a literal (e.g. pop('=')), or a predicate (e.g. pop(is_varname)).\"\"\"\n",
- " top = peek()\n",
- " if constraint is None or (top == constraint) or (callable(constraint) and constraint(top)):\n",
- " return tokens.pop(0)\n",
- "\n",
- "def lines(text): \n",
- " \"A list of the non-empty lines in a text.\"\n",
- " return [line for line in text.splitlines() if line]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "(Note: if I expected program lines to contain many tokens, I would use a `deque` instead of a `list` of tokens.) \n",
- "\n",
- "We can test `tokenizer` and the related functions:"
- ]
- },
{
"cell_type": "code",
"execution_count": 7,
@@ -224,7 +199,7 @@
{
"data": {
"text/plain": [
- "'ok'"
+ "['1', '0', 'G', 'O', 'T', 'O9', '9']"
]
},
"execution_count": 7,
@@ -233,24 +208,122 @@
}
],
"source": [
- "def test_tokenizer():\n",
- " global tokens\n",
- " assert tokenizer('X-1') == ['X', '-', '1'] # Numbers don't have a leading minus sign, so this isn't ['X', '-1']\n",
+ "tokenize('1 0 GOT O9 9')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "My first thought was to remove all white space from the input. But that would change the label string `\"HELLO WORLD\"` to `\"HELLOWORLD\"`, which is wrong. To remove spaces everywhere *except* between double quotes, I can tokenize the line, then join the tokens back together without spaces, then re-tokenize to get the final list of tokens. I do that in my new function below called `tokenizer`. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def tokenizer(line: str) -> List[str]:\n",
+ " \"Return a list of the tokens on this line, handling spaces properly, and upper-casing.\"\n",
+ " line = ''.join(tokenize(line)) # Remove whitespace\n",
+ " return tokenize(line.upper())"
+ ]
+ },
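To see why tokenizing twice works, here is a minimal self-contained sketch. It assumes a cut-down token pattern (`simple_tokenize`, with only two statement types) rather than the full one in the notebook, and `two_pass` is a hypothetical name that mirrors what `tokenizer` does.

```python
import re

# Cut-down token pattern: an assumption for this sketch, not the full BASIC pattern.
simple_tokenize = re.compile(r'''
      \d* \.? \d+        # number
    | GOTO | PRINT       # two statement types
    | [A-Z]\d?           # variable name
    | ".*?"              # label string (spaces inside are kept)
    | \S                 # any other single non-space character
    ''', re.VERBOSE).findall

def two_pass(line):
    "Tokenize, join without spaces, upper-case, and tokenize again."
    squeezed = ''.join(simple_tokenize(line))  # spaces survive only inside labels
    return simple_tokenize(squeezed.upper())

print(two_pass('1 0 GOT O9 9'))         # ['10', 'GOTO', '99']
print(two_pass('PRINT "HELLO WORLD"'))  # ['PRINT', '"HELLO WORLD"']
```

The first pass may split tokens in the wrong places, but it never drops a non-space character and never splits a label string, so joining its output yields the original line minus the insignificant spaces.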
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Peeking at and Popping Tokens\n",
+ "\n",
+ "I will store my list of tokens in the global variable `TOKENS` and access them through this interface:\n",
+ "* `peek()`: return the next token in `TOKENS` (without changing `TOKENS`), or `None` if there are no more tokens.\n",
+ "* `pop()`: remove and return the next token in `TOKENS`. \n",
+ "* `pop(\"string\")`: remove and return the next token if it is equal to the string; else return `None` and leave `TOKENS` unchanged.\n",
+ "* `pop(predicate)`: remove and return the next token if `predicate(token)` is true; else return `None` and leave `TOKENS` unchanged."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "Token = str # Type: a Token is a string\n",
+ "\n",
+ "TOKENS = [] # Global variable to hold a list of TOKENS\n",
+ "\n",
+ "def peek() -> Optional[Token]: \n",
+ " \"Return the first token in the global `TOKENS`, or None if we are at the end of the line.\"\n",
+ " return (TOKENS[0] if TOKENS else None)\n",
+ "\n",
+ "def pop(constraint=None) -> Optional[Token]:\n",
+ " \"\"\"Remove and return the first token in `TOKENS`, or return None if token fails constraint.\n",
+ " constraint can be None, a literal (e.g. pop('=')), or a predicate (e.g. pop(is_varname)).\"\"\"\n",
+ " token = peek()\n",
+ " if constraint is None or (token == constraint) or (callable(constraint) and constraint(token)):\n",
+ " return TOKENS.pop(0)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "(Note: if I expected program lines to contain many tokens, I would use a `deque` instead of a `list` of tokens.) \n",
+ "\n",
+ "# Testing the Tokenizer\n",
+ "\n",
+ "We can test `tokenizer` and the related functions:"
+ ]
+ },
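For reference, the `deque` variant mentioned in the note might look like the sketch below. This is an illustration, not the notebook's code. One deliberate difference, which is my own choice, is that `pop()` on an empty `TOKENS` returns `None` here instead of raising `IndexError`.

```python
from collections import deque

TOKENS = deque()  # Same role as the global list, but with O(1) removal from the left

def peek():
    "Return the first token, or None if there are no more tokens."
    return TOKENS[0] if TOKENS else None

def pop(constraint=None):
    "Remove and return the first token if it satisfies the constraint, else None."
    token = peek()
    if token is None:
        return None  # Assumption: treat an empty deque as a failed pop
    if constraint is None or token == constraint or (callable(constraint) and constraint(token)):
        return TOKENS.popleft()
    return None

TOKENS.extend(['10', 'GOTO', '99'])
print(pop(), pop('GOSUB'), pop('GOTO'), pop(str.isnumeric), peek())
```

`list.pop(0)` shifts every remaining element, so it costs time proportional to the length of the list; `deque.popleft()` does not. For lines with a handful of tokens the difference is negligible.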
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "def lines(text: str) -> List[str]: \n",
+ " \"A list of the non-empty lines in a text.\"\n",
+ " return [line for line in text.splitlines() if line]\n",
+ " \n",
+ "def test_tokenizer() -> bool:\n",
+ " global TOKENS\n",
+ " assert tokenizer('X-1') == ['X', '-', '1'] \n",
+ " assert tokenizer('-2') == ['-', '2']\n",
" assert tokenizer('PRINT \"HELLO WORLD\"') == ['PRINT', '\"HELLO WORLD\"']\n",
- " assert tokenizer('10 GOTO 99') == tokenizer('10GOTO99') == tokenizer('10 GO TO 99') == ['10', 'GOTO', '99']\n",
+ " assert (tokenizer('10 GOTO 99') == tokenizer('10 GO TO 99') == tokenizer('10GOTO99') == \n",
+ " tokenizer('1 0 GOT O9 9') == ['10', 'GOTO', '99'])\n",
" assert (tokenizer('100 PRINT \"HELLO WORLD\", SIN(X) ^ 2') == \n",
" ['100', 'PRINT', '\"HELLO WORLD\"', ',', 'SIN', '(', 'X', ')', '^', '2'])\n",
" assert (tokenizer('100IFX1+123.4+E1-12.3E4 <> 1.2E-34*-12E34+1+\"HI\" THEN99') ==\n",
" ['100', 'IF', 'X1', '+', '123.4', '+', 'E1', '-', '12.3E4', '<>', \n",
" '1.2E-34', '*', '-', '12E34', '+', '1', '+', '\"HI\"', 'THEN', '99'])\n",
" assert lines('one line') == ['one line']\n",
+ " assert lines('''two\n",
+ " lines''') == ['two', ' lines']\n",
" assert lines(program) == [\n",
" '10 REM POWER TABLE',\n",
" '11 DATA 8, 4',\n",
" '15 READ N0, P0',\n",
" '20 PRINT \"N\",',\n",
" '25 FOR P = 2 to P0',\n",
- " '30 PRINT \"N ^\" P,',\n",
+ " '30 PRINT \"N^\" P,',\n",
" '35 NEXT P',\n",
" '40 PRINT \"SUM\"',\n",
" '45 LET S = 0',\n",
@@ -269,7 +342,7 @@
" ['15', 'READ', 'N0', ',', 'P0'],\n",
" ['20', 'PRINT', '\"N\"', ','],\n",
" ['25', 'FOR', 'P', '=', '2', 'TO', 'P0'],\n",
- " ['30', 'PRINT', '\"N ^\"', 'P', ','],\n",
+ " ['30', 'PRINT', '\"N^\"', 'P', ','],\n",
" ['35', 'NEXT', 'P'],\n",
" ['40', 'PRINT', '\"SUM\"'],\n",
" ['45', 'LET', 'S', '=', '0'],\n",
@@ -283,7 +356,7 @@
" ['85', 'NEXT', 'N'],\n",
" ['99', 'END']]\n",
"\n",
- " tokens = tokenizer('10 GO TO 99') \n",
+ " TOKENS = tokenizer('10 GO TO 99') \n",
" assert peek() == '10'\n",
" assert pop() == '10'\n",
" assert peek() == 'GOTO'\n",
@@ -293,50 +366,13 @@
" assert pop('98.6') == None # '99' is not '98.6'\n",
" assert peek() == '99'\n",
" assert pop(str.isnumeric) == '99' # '99' is numeric\n",
- " assert peek() is None and not tokens \n",
+ " assert peek() is None and not TOKENS \n",
" \n",
- " return 'ok'\n",
+ " return True\n",
" \n",
"test_tokenizer()"
]
},
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "['10 REM POWER TABLE',\n",
- " '11 DATA 8, 4',\n",
- " '15 READ N0, P0',\n",
- " '20 PRINT \"N\",',\n",
- " '25 FOR P = 2 to P0',\n",
- " '30 PRINT \"N ^\" P,',\n",
- " '35 NEXT P',\n",
- " '40 PRINT \"SUM\"',\n",
- " '45 LET S = 0',\n",
- " '50 FOR N = 2 TO N0',\n",
- " '55 PRINT N,',\n",
- " '60 FOR P = 2 TO P0',\n",
- " '65 LET S = S + N ^ P',\n",
- " '70 PRINT N ^ P,',\n",
- " '75 NEXT P',\n",
- " '80 PRINT S',\n",
- " '85 NEXT N',\n",
- " '99 END']"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "lines(program)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -345,8 +381,8 @@
"\n",
"Parsing is the process of transforming the text of a program into an internal representation, which can then be executed.\n",
"For BASIC, the representation will be an ordered list of statements, and we'll need various data types to represent the parts of the statements.\n",
- "I'll start by showing the grammar of BASIC statements, as seen on pages 56-57 of [the manual](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) (see also pages 26-30 for a simpler introduction). A statement starts with a line number, and then can be one of the following 15 types of statements, each \n",
- "type introduced by a distinct keyword:\n",
+ "I'll start by showing the grammar of BASIC statements, as seen in the appendix on pages 56-57 of [the manual](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) (see also pages 26-30 for a simpler introduction). A statement starts with a line number, and then can be one of the following 15 types of statements, each \n",
+ "type introduced by a distinct keyword and followes by components that are specific to the type:\n",
"\n",
"- **`LET`** `` **=** ``\n",
"- **`READ`** ``\n",
@@ -354,28 +390,28 @@
"- **`PRINT`** ``\n",
"- **`GOTO`** ``\n",
"- **`IF`** ` ` **`THEN`** ``\n",
- "- **`FOR`** `` **=** `` **`TO`** ` [`**`STEP`** `]`\n",
+ "- **`FOR`** `` **=** `` **`TO`** ` [`**`STEP`** ` ]`\n",
"- **`NEXT`** ``\n",
"- **`END`**\n",
"- **`STOP`**\n",
- "- **`DEF`** ``**(**``**) = **``\n",
+ "- **`DEF`** `` **(** `` **)** **=** ``\n",
"- **`GOSUB`** ``\n",
"- **`RETURN`**\n",
"- **`DIM`** ``\n",
"- **`REM`** ``\n",
" \n",
- "The notation `` means any variable and `` means zero or more variables, separated by commas. `[`**`STEP`** `]` means that the literal string `\"STEP\"` followed by an expression is optional. \n",
+ "The notation `` means any variable and `` means zero or more variables, separated by commas. `[`**`STEP`** ` ]` means that the literal string `\"STEP\"` followed by an expression is optional. **Bold** characters must be matched exactly.\n",
"\n",
"Rather than use one of the many [language parsing frameworks](https://wiki.python.org/moin/LanguageParsing), I will show how to build a parser from scratch. First I'll translate the grammar above into Python. Not character-for-character (because it would take a lot of work to get Python to understand how to handle those characters), but almost word-for-word (because I can envision a straightforward way to get Python to handle the following format):"
]
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
- "def Grammar(): \n",
+ "def Grammar() -> Dict[Token, list]: \n",
" return {\n",
" 'LET': [variable, '=', expression],\n",
" 'READ': [list_of(variable)],\n",
@@ -401,43 +437,38 @@
"source": [
"# Parsing Strategy\n",
"\n",
- "The grammar of BASIC is designed so that at every point, the next token tells us unambiguously how to parse. For example, the first token after the line number defines the type of statement; also, in an expression we know that all three-letter names are functions while all 1-letter names are variables. So in writing the various grammatical category functions, a common pattern is to either `peek()` at the next token or try a `pop(`*constraint*`)`, and from that decide what to parse next, and never have to back up or undo a `pop()`. Here is my strategy for parsing statements:\n",
+ "A program is parsed into a list of statements, each statement an object of type `Stmt`. A statement contains various components: a line number, a statement type, and various parts such as numbers, variable names, subscripted variable references, user-defined functions, function calls, operation calls, and labels.\n",
+ "\n",
+ "The grammar of BASIC is designed so that at every point, the next token tells us unambiguously how to parse. For example, the first token after the line number defines the type of statement; also, in an expression we know that all three-letter names are functions while all 1-letter names are variables. So the next token will always determine what to parse next, and we will never have to back up and undo a `pop()`. Here is my strategy for parsing statements:\n",
"\n",
"* The grammatical categories, like `variable` and `expression` (and also `statement`), will be defined as functions\n",
- "(with no argument) that pop tokens from the global variable `tokens`, and return a data object. For example, calling `linenumber()` will pop a token, convert it to an `int`, and return that. \n",
- "\n",
- "* Consider parsing the statement `\"20 LET X = X + 1\"`. \n",
- "\n",
- "* First tokenize to get: `tokens = ['20', 'LET', 'X', '=', 'X', '+', '1']`.\n",
- "\n",
+ "(with no argument) that pop tokens from the global variable `TOKENS`, and return a data object. For example, calling `linenumber()` will pop a token, convert it to an `int`, and return that. \n",
+ "* As an example, consider parsing the statement `\"20 LET X = X + 1\"`. \n",
+ "* First tokenize to get: `TOKENS = ['20', 'LET', 'X', '=', 'X', '+', '1']`.\n",
"* Then call `statement()` (defined below).\n",
- "\n",
- " * `statement` first calls `linenumber()`, getting back the integer `20` (and removing `'20'` from `tokens`).\n",
- "\n",
- " * Then it calls `pop()` to get `'LET'` (and removing `'LET'` from `tokens`).\n",
+ " * `statement` first calls `linenumber()`, getting back the integer `20` (and removing `'20'` from `TOKENS`).\n",
+ " * Then it calls `pop()` to get `'LET'` (removing `'LET'` from `TOKENS`).\n",
" * Then it indexes into the grammar with `'LET'`, retrieving the grammar rule `[variable, '=', expression]`.\n",
- "\n",
" * Then it processes the 3 constituents of the grammar rule:\n",
" * First, call `variable()`, which removes and returns `'X'`.\n",
- " * Second, call `pop('=')`, which removes `'='` from `tokens`, and discard it.\n",
- " * Third, call `expression()`, which returns a representation of `X + 1`; let's write that as `Opcall('X', '+', 1.0)`.\n",
- "\n",
+ " * Second, call `pop('=')`, which removes `'='` from `TOKENS`, and discard it.\n",
+ " * Third, call `expression()`, which returns a representation of `X + 1`; we write that as `Opcall('X', '+', 1.0)`.\n",
" * Finally, `statement` assembles the pieces and returns `Stmt(num=20, typ='LET', args=['X', Opcall('X', '+', 1.0)])`.\n",
- "* If anything goes wrong, call `fail(\"`*error message*`\")`, which raises an error.\n",
+ "* If anything goes wrong, call `fail`, which raises an error.\n",
"\n",
- "Here is the definition of `statement`:\n"
+ "Here is the definition of the `Stmt` type and the `statement` grammatical category function:\n"
]
},
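The walk-through above can be condensed into a small runnable sketch. It is an assumption about the general shape of `statement`, not the notebook's definition: `GRAMMAR` holds only two rules, `rest` is a hypothetical stand-in for `expression` that just collects the remaining tokens, `varname` accepts any alphanumeric token, and `parse_tokens` is a helper invented for the demonstration.

```python
from collections import namedtuple

Stmt = namedtuple('Stmt', 'num, typ, args')
TOKENS = []

def pop(constraint=None):
    "Remove and return the first token if it satisfies the constraint, else None."
    token = TOKENS[0] if TOKENS else None
    if token is not None and (constraint is None or token == constraint
                              or (callable(constraint) and constraint(token))):
        return TOKENS.pop(0)

def fail(message): raise SyntaxError(message)

def linenumber(): return int(pop(str.isnumeric) or fail('missing line number'))
def varname():    return pop(str.isalnum) or fail('expected a variable name')

def rest():
    "Stand-in for `expression`: return the remaining tokens unparsed."
    result = TOKENS[:]
    TOKENS.clear()
    return result

GRAMMAR = {'LET': [varname, '=', rest], 'GOTO': [linenumber]}  # Two rules only

def statement():
    "Pop a line number and statement type, then follow the grammar rule."
    num = linenumber()
    typ = pop(lambda t: t in GRAMMAR) or fail('unknown statement type')
    args = []
    for part in GRAMMAR[typ]:
        if callable(part):            # A grammatical category: call it, keep the result
            args.append(part())
        else:                         # A literal such as '=': it must be there; discard it
            pop(part) or fail('expected ' + repr(part))
    return Stmt(num, typ, args)

def parse_tokens(tokens):
    TOKENS[:] = tokens
    return statement()

print(parse_tokens(['20', 'LET', 'X', '=', 'X', '+', '1']))
```

Each step either consumes a token or fails, so the parser never needs to back up.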
{
"cell_type": "code",
- "execution_count": 10,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 12,
+ "metadata": {},
"outputs": [],
"source": [
- "def statement():\n",
- " \"Parse a BASIC statement from `tokens`.\"\n",
+ "Stmt = namedtuple('Stmt', 'num, typ, args') # E.g. '10 GOTO 999' => Stmt(10, 'GOTO', 999)\n",
+ "\n",
+ "def statement() -> Stmt:\n",
+ " \"Parse a BASIC statement from the global variable `TOKENS`.\"\n",
" num = linenumber()\n",
" typ = pop(is_stmt_type) or fail('unknown statement type')\n",
" args = []\n",
@@ -453,22 +484,22 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Some of the grammatical categories, like `expression`, are complex. But many of the categories are easy one-liners:"
+ "Many of the grammatical categories are easy one-liners:"
]
},
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
- "def number(): return (-1 if pop('-') else +1) * float(pop()) # Optional minus sign\n",
- "def step(): return (expression() if pop('STEP') else 1) # 1 is the default step\n",
- "def linenumber(): return (int(pop()) if peek().isnumeric() else fail('missing line number'))\n",
+ "def number(): return (-number() if pop('-') else float(pop())) # Optional minus sign\n",
+ "def step(): return (expression() if pop('STEP') else 1) # 1 is the default step\n",
+ "def linenumber(): return int(pop(is_number) or fail('missing line number'))\n",
"def relational(): return pop(is_relational) or fail('expected a relational operator')\n",
"def varname(): return pop(is_varname) or fail('expected a variable name')\n",
"def funcname(): return pop(is_funcname) or fail('expected a function name')\n",
- "def anycharacters(): tokens.clear() # Ignore tokens in a REM statement"
+ "def anycharacters(): TOKENS.clear() # Ignore tokens in a REM statement"
]
},
{
@@ -480,39 +511,39 @@
},
{
"cell_type": "code",
- "execution_count": 12,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 14,
+ "metadata": {},
"outputs": [],
"source": [
- "def is_stmt_type(x): return is_str(x) and x in grammar # LET, READ, ...\n",
- "def is_funcname(x): return is_str(x) and len(x) == 3 and x.isalpha() # SIN, COS, FNA, FNB, ...\n",
- "def is_varname(x): return is_str(x) and len(x) in (1, 2) and x[0].isalpha() # A, A1, A2, B, ...\n",
- "def is_label(x): return is_str(x) and x.startswith('\"') # \"HELLO WORLD\", ...\n",
- "def is_relational(x): return is_str(x) and x in ('<', '=', '>', '<=', '<>', '>=')\n",
- "def is_number(x): return is_str(x) and x and x[0] in '.0123456789' # '3', '.14', ...\n",
+ "def is_stmt_type(x) -> bool: return isa(x, Token) and x in grammar # LET, READ, ...\n",
+ "def is_funcname(x) -> bool: return isa(x, Token) and len(x) == 3 and x.isalpha() # SIN, COS, FNA, FNB, ...\n",
+ "def is_varname(x) -> bool: return isa(x, Token) and len(x) in (1, 2) and x[0].isalpha() # A, A1, A2, B, ...\n",
+ "def is_label(x) -> bool: return isa(x, Token) and x.startswith('\"') # \"HELLO WORLD\", ...\n",
+ "def is_relational(x) -> bool: return isa(x, Token) and x in ('<', '=', '>', '<=', '<>', '>=')\n",
+ "def is_number(x) -> bool: return isa(x, Token) and x and x[0] in '.0123456789'\n",
"\n",
- "def is_str(x): return isinstance(x, str)"
+ "isa = isinstance"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "Note that `varname` means an unsubscripted variable name (a letter by itself, like `X`, or followed by a digit, like `X3`), and that `variable` is a `varname` optionally followed by index expressions in parentheses, like `A(I)` or `M(2*I, 3)`: "
+ "Note that `varname` is what the manual calls \"unsubscripted variable name\" (a letter by itself, like `X`, or followed by a digit, like `X3`), and that `variable` is a `varname` optionally followed by one or two subscript expressions in parentheses, like `A(I)` or `M(2*I, J)`: "
]
},
{
"cell_type": "code",
- "execution_count": 13,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 15,
+ "metadata": {},
"outputs": [],
"source": [
- "def variable(): \n",
- " \"Parse a possibly subscripted variable e.g. 'X3' or 'A(I)' or 'M(2*I, 3)'.\"\n",
+ "Varname = str # Type for a variable name such as `V` or `X3`\n",
+ "Subscript = namedtuple('Subscript', 'var, indexes') # E.g. 'A(2, 3)' => Subscript('A', [2, 3])\n",
+ "Variable = Union[Varname, Subscript]\n",
+ "\n",
+ "def variable() -> Variable: \n",
+ " \"Parse a possibly subscripted variable e.g. 'X3' or 'A(I)' or 'M(I, J)'.\"\n",
" V = varname()\n",
" if pop('('):\n",
" indexes = list_of(expression)()\n",
@@ -531,17 +562,15 @@
},
{
"cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 16,
+ "metadata": {},
"outputs": [],
"source": [
"class list_of:\n",
" \"list_of(category) is a callable that parses a comma-separated list of \"\n",
" def __init__(self, category): self.category = category\n",
- " def __call__(self):\n",
- " result = ([self.category()] if tokens else [])\n",
+ " def __call__(self) -> list:\n",
+ " result = ([self.category()] if TOKENS else [])\n",
" while pop(','):\n",
" result.append(self.category())\n",
" return result"
@@ -553,20 +582,16 @@
"source": [
"# Parsing: Top Level `parse`, and Handling Errors\n",
"\n",
- "Most of the parsing action happens inside the function `statement()`, but at the very top level, `parse(program)` takes a program text (that is, a string), and parses each line by calling `parse_line`, sorting the resulting list of lines by line number. If we didn't have to handle errors, this would be simple:"
+ "Most of the parsing action happens inside the function `statement()`, but at the very top level, `parse(program)` takes a program text (that is, a string), and parses each line by calling `parse_line`, sorting the resulting list of lines by line number:"
]
},
{
"cell_type": "code",
- "execution_count": 15,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 17,
+ "metadata": {},
"outputs": [],
"source": [
- "def parse(program): return sorted(parse_line(line) for line in lines(program))\n",
- "\n",
- "def parse_line(line): global tokens; tokens = tokenizer(line); return statement()"
+ "def parse(program: str) -> List[Stmt]: return sorted(map(parse_line, lines(program)))"
]
},
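Sorting works without a `key` because a `namedtuple` compares like an ordinary tuple, field by field, and `num` is the first field. A minimal sketch, with made-up statements:

```python
from collections import namedtuple

Stmt = namedtuple('Stmt', 'num, typ, args')

# Statements deliberately out of order.
stmts = [Stmt(20, 'PRINT', ['X']), Stmt(99, 'END', []), Stmt(10, 'LET', ['X', 1.0])]

by_tuple = sorted(stmts)                           # Compares num first, then typ, then args
by_key   = sorted(stmts, key=lambda stmt: stmt.num)  # Compares num only

print([stmt.num for stmt in by_tuple])
```

The two orderings agree whenever line numbers are distinct. If two statements share a line number, plain `sorted` falls through to comparing `typ` and then `args`, which could raise a `TypeError` for arguments of mixed types; the `key` version avoids that comparison.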
{
@@ -578,79 +603,39 @@
},
{
"cell_type": "code",
- "execution_count": 16,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 18,
+ "metadata": {},
"outputs": [],
"source": [
- "def parse_line(line):\n",
+ "def parse_line(line: str) -> Stmt:\n",
" \"Return a Stmt(linenumber, statement_type, arguments).\"\n",
- " global tokens\n",
- " tokens = tokenizer(line)\n",
+ " global TOKENS\n",
+ " TOKENS = tokenizer(line)\n",
" try:\n",
" stmt = statement()\n",
- " if tokens: fail('extra tokens at end of line')\n",
+ " if TOKENS: \n",
+ " fail('extra TOKENS at end of line')\n",
" return stmt\n",
" except SyntaxError as err:\n",
- " print(\"Error in line '{}' at '{}': {}\".format(line, ' '.join(tokens), err))\n",
- " return Stmt(0, 'REM', []) # Return dummy statement\n",
+ " error = f\"Error in line '{line}' at '{' '.join(TOKENS)}': {err}\"\n",
+ " print(error)\n",
+ " return Stmt(0, 'REM', [error]) # Return dummy statement\n",
" \n",
"def fail(message): raise SyntaxError(message)"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Parsing: Building A Representation of the Program\n",
- "\n",
- "A program is represented by various data structures: a list of statements, where each statement contains various components: subscripted variable references, user-defined functions, function calls, operation calls, variable names, numbers, and labels. Here I define these data structures with `namedtuple`s:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "from collections import namedtuple, defaultdict, deque\n",
- "\n",
- "Stmt = namedtuple('Stmt', 'num, typ, args') # '1 GOTO 9' => Stmt(1, 'GOTO', 9)\n",
- "Subscript = namedtuple('Subscript', 'var, indexes') # 'A(I)' => Subscript('A', ['I'])\n",
- "Funcall = namedtuple('Funcall', 'f, x') # 'SQR(X)' => Funcall('SQR', 'X')\n",
- "Opcall = namedtuple('Opcall', 'x, op, y') # 'X + 1' => Opcall('X', '+', 1)\n",
- "ForState = namedtuple('ForState', 'continu, end, step') # Data for FOR loop \n",
- "\n",
- "class Function(namedtuple('_', 'parm, body')):\n",
- " \"User-defined function; 'DEF FNC(X) = X ^ 3' => Function('X', Opcall('X', '^', 3))\"\n",
- " def __call__(self, value): \n",
- " variables[self.parm] = value # Global assignment to the parameter\n",
- " return evalu(self.body)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The first four namedtuples should be self-explanatory. The next one, `ForState`, is used to represent the state of a `FOR` loop variable while the program is running, but does not appear in the static representation of the program.\n",
- "`Function` is used to represent the definition of a user defined function. When the user writes `\"DEF FNC(X) = X ^ 3\"`, we create an object with `Function(parm='X', body=Opcall('X', '^', 3))`, and whenever the program calls, say, `FNC(2)` in an expression, the call returns 8, and also assigns 2 to the *global* variable `X` (whereas in modern languages, it would temporarily bind a new *local* variable named `X`). BASIC has no local variables. Note the technique of making `Function` be a subclass of a `namedtuple`; we are then free to add the `__call__` method to the subclass."
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parsing: Grammar of `PRINT` Statements\n",
"\n",
- "On page 26 of the manual, it appears that the grammar rule for `PRINT` should be `[list_of(expression)]`. But in section 3.1, **More about PRINT**, some complications are introduced:\n",
+ "On page 26 of [the manual](https://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf), it appears that the grammar rule for `PRINT` should be `[list_of(expression)]`. But in section 3.1, **More about PRINT**, some complications are introduced:\n",
"\n",
"* Labels (strings enclosed in double quotes) are allowed, as well as expressions.\n",
"* The `\",\"` is not a separator. A line can end with `\",\"`.\n",
"* Optionally, `\";\"` can be used instead of `\",\"`.\n",
- "* Optionally, the `\",\"` or `\";\"` can be omitted—we can have a label immediately followed by an expression.\n",
+ "* Optionally, the `\",\"` or `\";\"` after a label can be omitted—we can have a label immediately followed by an expression.\n",
"\n",
"The effect of a comma is to advance the output to the next column that is a multiple of 15 (and to a new line if this goes past column 100). The effect of a semicolon is similar, but works in multiples of 3, not 15. (Note that column numbering starts at 0, not 1.) Normally, at the end of a `PRINT` statement we advance to a new line, but this is not done if the statement ends in `\",\"` or `\";\"`. Here are some examples:\n",
"\n",
@@ -673,18 +658,15 @@
},
{
"cell_type": "code",
- "execution_count": 18,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 19,
+ "metadata": {},
"outputs": [],
"source": [
- "def labels_and_expressions():\n",
- " \"Parse a sequence of label / comma / semicolon / expression (for PRINT statement).\"\n",
+ "def labels_and_expressions() -> list:\n",
+ " \"Parse a sequence of label | comma | semicolon | expression (for PRINT statement).\"\n",
" result = []\n",
- " while tokens:\n",
- " item = pop(is_label) or pop(',') or pop(';') or expression()\n",
- " result.append(item)\n",
+ " while TOKENS:\n",
+ " result.append(pop(is_label) or pop(',') or pop(';') or expression())\n",
" return result"
]
},
@@ -714,13 +696,13 @@
},
{
"cell_type": "code",
- "execution_count": 19,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 20,
+ "metadata": {},
"outputs": [],
"source": [
- "def expression(prec=1): \n",
+ "Opcall = namedtuple('Opcall', 'x, op, y') # E.g., 'X + 1' => Opcall('X', '+', 1)\n",
+ "\n",
+ "def expression(prec=1) -> object: \n",
" \"Parse an expression: a primary and any [op expression]* pairs with precedence(op) >= prec.\"\n",
" exp = primary() # 'A' => 'A'\n",
" while precedence(peek()) >= prec:\n",
@@ -729,7 +711,7 @@
" exp = Opcall(exp, op, rhs) # 'A + B' => Opcall('A', '+', 'B')\n",
" return exp\n",
"\n",
- "def primary():\n",
+ "def primary() -> object:\n",
" \"Parse a primary expression (no infix op except maybe within parens).\"\n",
" if is_number(peek()): # '1.23' => 1.23 \n",
" return number()\n",
@@ -746,13 +728,39 @@
" else:\n",
" return fail('unknown expression')\n",
"\n",
- "def precedence(op): \n",
+ "def precedence(op) -> int: \n",
" return (3 if op == '^' else 2 if op in ('*', '/', '%') else 1 if op in ('+', '-') else 0)\n",
"\n",
- "def associativity(op): \n",
+ "def associativity(op) -> int: \n",
" return (0 if op == '^' else 1)"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Parsing: Function Definitions and Calls\n",
+ "\n",
+ "The class `Function` is used to represent the definition of a user defined function. When the user writes `\"DEF FNC(X) = X ^ 3\"`, we specify the function object with `Function(parm='X', body=Opcall('X', '^', 3))`, and whenever the program calls, say, `FNC(2)` in an expression, the call assigns 2 to the **global** variable `X` and computes 8 as the value to return. (In modern languages, `X` would be a new **local** variable, but BASIC has no local variables.) Note the technique of making `Function` be a subclass of a `namedtuple`; we are then free to add the `__call__` method to the subclass.\n",
+ "\n",
+ "Note the distinction that `Function` is used to represent a user-defined function such as `FNC`, but `Funcall` is used to represent a function call such as `FNC(2)`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "Funcall = namedtuple('Funcall', 'function, arg') # E.g., 'SIN(X)' => Funcall('SIN', 'X')\n",
+ "\n",
+ "class Function(namedtuple('_', 'parm, body')):\n",
+ " \"User-defined function; 'DEF FNC(X) = X ^ 3' => Function('X', Opcall('X', '^', 3))\"\n",
+ " def __call__(self, value): \n",
+ " variables[self.parm] = value # Global assignment to the parameter\n",
+ " return evaluate(self.body)"
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {},
@@ -764,8 +772,13 @@
},
{
"cell_type": "code",
- "execution_count": 20,
- "metadata": {},
+ "execution_count": 22,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
@@ -775,7 +788,7 @@
" Stmt(num=15, typ='READ', args=[['N0', 'P0']]),\n",
" Stmt(num=20, typ='PRINT', args=[['\"N\"', ',']]),\n",
" Stmt(num=25, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
- " Stmt(num=30, typ='PRINT', args=[['\"N ^\"', 'P', ',']]),\n",
+ " Stmt(num=30, typ='PRINT', args=[['\"N^\"', 'P', ',']]),\n",
" Stmt(num=35, typ='NEXT', args=['P']),\n",
" Stmt(num=40, typ='PRINT', args=[['\"SUM\"']]),\n",
" Stmt(num=45, typ='LET', args=['S', 0.0]),\n",
@@ -790,7 +803,7 @@
" Stmt(num=99, typ='END', args=[])]"
]
},
- "execution_count": 20,
+ "execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -805,35 +818,43 @@
"cell_type": "markdown",
"metadata": {},
"source": [
+ "You might think that the statement \"`11 DATA 8, 4`\" would be parsed into a `Stmt` with `args=[8, 4]`. It has two arguments, right? But according to the grammar, a `DATA` statement actually has only one argument, which is a list of numbers. That's why we get `args=[[8.0, 4.0]]`. \n",
+ "\n",
"Here are some more tests:"
]
},
{
"cell_type": "code",
- "execution_count": 21,
- "metadata": {},
+ "execution_count": 23,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
"text/plain": [
- "'ok'"
+ "True"
]
},
- "execution_count": 21,
+ "execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "def test_exp(text, repr):\n",
- " \"Test that text can be parsed as an expression to yield repr, with no tokens left over.\"\n",
- " global tokens\n",
- " tokens = tokenizer(text)\n",
- " return (expression() == repr) and not tokens\n",
+ "def test_exp(text: str, repr) -> bool:\n",
+ " \"Test that text can be parsed as an expression to yield repr, with no TOKENS left over.\"\n",
+ " global TOKENS\n",
+ " TOKENS = tokenizer(text)\n",
+ " return (expression() == repr) and not TOKENS\n",
" \n",
- "def test_parser():\n",
+ "def test_parser() -> bool:\n",
+ " \"Tests of `expression` and other category functions.\"\n",
" assert is_funcname('SIN') and is_funcname('FNZ') # Function names are three letters\n",
- " assert not is_funcname('X') and not is_funcname('')\n",
+ " assert not is_funcname('X') and not is_funcname('') and not is_funcname('FN9')\n",
" assert is_varname('X') and is_varname('A2') # Variables names are one letter and an optional digit\n",
" assert not is_varname('FNZ') and not is_varname('A10') and not is_varname('')\n",
" assert is_relational('>') and is_relational('>=') and not is_relational('+')\n",
@@ -849,7 +870,7 @@
" assert test_exp('X--Y--Z', Opcall(Opcall('X', '-', Funcall('NEG', 'Y')), \n",
" '-', Funcall('NEG', 'Z')))\n",
" assert test_exp('((((X))))', 'X')\n",
- " return 'ok'\n",
+ " return True\n",
"\n",
"test_parser()"
]
@@ -865,68 +886,73 @@
},
{
"cell_type": "code",
- "execution_count": 22,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 24,
+ "metadata": {},
"outputs": [],
"source": [
- "def run(program): execute(parse(program))"
+ "def run(program: str) -> None: \n",
+ " \"\"\"Parse and execute the BASIC program.\"\"\"\n",
+ " execute(parse(program))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "The function `execute(stmts)` first calls `preprocess(stmts)` to handle *declarations*: `DATA` and `DEF` statements that are processed one time only, before the program runs, regardless of their line numbers. (`DIM` statements are also declarations, but I decided that all lists/tables can have any number of elements, so I can ignore `DIM` declarations.)\n",
- "`execute` keeps track of the state of the program, partially in three globals:\n",
+ "The function `execute(stmts)` first defines the runtime state of the program, which is partially held in three global variables (they are global because they are needed within `basic_print` or `evaluate`, and because the very nature of BASIC is to have global variables):\n",
"\n",
+ "* `column`: The column that `PRINT` will print in next.\n",
"* `variables`: A mapping of the values of all BASIC variables (both subscripted and unsubscripted).
For example, `{'P1': 3.14, ('M', (1, 1)): 42.0}` says that the value of `P1` is `3.14` and `M(1, 1)` is `42.0`.\n",
"* `functions`: A mapping of the values of all BASIC functions (both built-in and user-defined).
For example, `{'FNC': Function('X', Opcall('X', '^', 3)), 'SIN': math.sin}` \n",
- "* `column`: The column that `PRINT` will print in next.\n",
"\n",
- "And also with these local variables:\n",
+ "Program state is also held in these variables that are local to `execute`:\n",
"\n",
- "* `data`: a queue of all the numbers in `DATA` statements.\n",
- "* `pc`: program counter; the index into the list of statements.\n",
+ "* `data`: a queue of all the numbers in the program's `DATA` statements.\n",
+ "* `pc`: program counter; the index into the list of statements. Initially 0, the index of the first statement.\n",
"* `ret`: the index where a `RETURN` statement will return to.\n",
- "* `fors`: a map of `{varname: ForState(...)}` which gives the state of each `FOR` loop variable.\n",
- "* `goto`: a mapping of `{linenumber: index}`, for example `{10: 0, 20: 1}` for a program with two line numbers, 10 and 20.\n",
+ "* `fors`: a dict of `{varname: ForState(...)}` which gives the state of each `FOR` loop variable.\n",
+ "* `goto`: a dict of `{linenumber: index}`, for example `{10: 0, 20: 1}` for a program with two line numbers, 10 and 20.\n",
"\n",
"\n",
- "Running the program means executing the statement that the program counter (`pc`) is currently pointing at, repeatedly, until we hit an `END` or `STOP` statement (or a `READ` statement when there is no more data). \n",
- "The variable `pc` is initialized to `0` (the index of the first statement in the program) and is then incremented by `1` each cycle to go to the following statement; but a branching statement (`GOTO`, `IF`, `GOSUB`, or `RETURN`) can change the `pc` to something other than the following statement. Note that branching statements refer to line numbers, but the `pc` refers to the *index* number within the list of statements. The variable `goto` maps from line numbers to index numbers. In BASIC there is no notion of a *stack*, neither for variables nor return addresses. If I do a `GOSUB` to a subroutine that itself does a `GOSUB`, then the original return address is lost, because BASIC has only one return address register (which we call `ret`).\n",
+ "Running the program means executing the statement that the program counter (`pc`) is currently pointing at, repeatedly, until we either hit an `END` or `STOP` statement, attempt a `READ` when there is no more data, or fall off the end of the program. \n",
"\n",
- "The main body of `execute` checks the statement type, and takes appropriate action. All the statement types are straightforward, except for `FOR` and `NEXT`, which are explained a bit later."
+ "The variable `pc` is initialized to `0` and is then incremented by `1` each cycle to go to the following statement; but a branching statement (`GOTO`, `IF`, `GOSUB`, or `RETURN`) can change the `pc` to something other than the following statement. Note that branching statements refer to line numbers, but the `pc` refers to the *index* number within the list of statements (as does the `ForSTaate.continu` attribute). The variable `goto` maps from line numbers to index numbers. \n",
+ "\n",
+ "In BASIC there is no notion of a *stack*, neither for variables nor return addresses. If I do a `GOSUB` to a subroutine that itself does a `GOSUB`, then the original return address is lost, because BASIC has only one return address register (which we call `ret`).\n",
+ "\n",
+ "The main body of `execute` checks the statement type, and takes appropriate action. All the statement types are straightforward, except for `FOR` and `NEXT`, which are explained in the next section, and `PRINT` which is complicated by the need to keep track of columns."
]
},
{
"cell_type": "code",
- "execution_count": 23,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 25,
+ "metadata": {},
"outputs": [],
"source": [
- "def execute(stmts): \n",
- " \"Parse and execute the BASIC program.\"\n",
- " global variables, functions, column\n",
- " functions, data = preprocess(stmts) # {name: function,...}, deque[number,...]\n",
- " variables = defaultdict(float) # mapping of {variable: value}, default 0.0\n",
+ "ForState = namedtuple('ForState', 'continu, end, step') # Data for FOR loop \n",
+ "\n",
+ "def execute(stmts: List[Stmt]) -> None: \n",
+ " \"Execute the statements in the BASIC program.\"\n",
+ " global column, variables, functions, column\n",
" column = 0 # column to PRINT in next\n",
+ " variables = defaultdict(float) # mapping of {variable: value}, default 0.0\n",
+ " functions = {**builtins,\n",
+ " **{name: Function(x, body) \n",
+ " for (_, _, (name, x, body)) in statements_of_type('DEF', stmts)}}\n",
+ " \n",
+ " data = deque(append(s.args[0] for s in statements_of_type('DATA', stmts)))\n",
" pc = 0 # program counter\n",
" ret = 0 # index (pc) that a GOSUB returns to\n",
" fors = {} # runtime map of {varname: ForState(...)}\n",
- " goto = {stmt.num: i # map of {linenumber: index}\n",
- " for (i, stmt) in enumerate(stmts)}\n",
+ " goto = {stmt.num: i for (i, stmt) in enumerate(stmts)} # {line_number: index}\n",
" while pc < len(stmts):\n",
- " (_, typ, args) = stmts[pc] # Fetch and decode the instruction\n",
+ " (_, typ, args) = stmts[pc] # Fetch and decode the current statement\n",
" pc += 1 # Increment the program counter\n",
" if typ in ('END', 'STOP') or (typ == 'READ' and not data): \n",
" return\n",
" elif typ == 'LET':\n",
" V, exp = args\n",
- " let(V, evalu(exp))\n",
+ " let(V, evaluate(exp))\n",
" elif typ == 'READ':\n",
" for V in args[0]:\n",
" let(V, data.popleft())\n",
@@ -936,12 +962,17 @@
" pc = goto[args[0]]\n",
" elif typ == 'IF':\n",
" lhs, relational, rhs, dest = args\n",
- " if functions[relational](evalu(lhs), evalu(rhs)):\n",
+ " if functions[relational](evaluate(lhs), evaluate(rhs)):\n",
" pc = goto[dest]\n",
+ " elif typ == 'GOSUB':\n",
+ " ret = pc\n",
+ " pc = goto[args[0]]\n",
+ " elif typ == 'RETURN':\n",
+ " pc = ret\n",
" elif typ == 'FOR':\n",
" V, start, end, step = args\n",
- " variables[V] = evalu(start)\n",
- " fors[V] = ForState(pc, evalu(end), evalu(step))\n",
+ " variables[V] = evaluate(start)\n",
+ " fors[V] = ForState(pc, evaluate(end), evaluate(step))\n",
" elif typ == 'NEXT':\n",
" V = args[0]\n",
" continu, end, step = fors[V]\n",
@@ -949,11 +980,8 @@
" (step < 0 and variables[V] + step >= end)):\n",
" variables[V] += step\n",
" pc = continu\n",
- " elif typ == 'GOSUB':\n",
- " ret = pc\n",
- " pc = goto[args[0]]\n",
- " elif typ == 'RETURN':\n",
- " pc = ret"
+ " else:\n",
+ " assert typ in ('REM', 'DIM', 'DATA', 'DEF') # Ignore these at runtime"
]
},
{
@@ -965,52 +993,46 @@
},
{
"cell_type": "code",
- "execution_count": 24,
- "metadata": {
- "collapsed": true
- },
+ "execution_count": 26,
+ "metadata": {},
"outputs": [],
"source": [
"import math\n",
"import random\n",
"import operator as op\n",
"\n",
- "def preprocess(stmts):\n",
- " \"\"\"Go through stmts and return two values extracted from the declarations: \n",
- " functions: a mapping of {name: function}, for both built-in and user-defined functions,\n",
- " data: a queue of all the numbers in DATA statements.\"\"\"\n",
- " functions = { # A mapping of {name: function}; first the built-ins:\n",
- " 'SIN': math.sin, 'COS': math.cos, 'TAN': math.tan, 'ATN': math.atan, \n",
- " 'ABS': abs, 'EXP': math.exp, 'LOG': math.log, 'SQR': math.sqrt, 'INT': int,\n",
- " '>': op.gt, '<': op.lt, '=': op.eq, '>=': op.ge, '<=': op.le, '<>': op.ne, \n",
- " '^': pow, '+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv, '%': op.mod,\n",
- " 'RND': lambda _: random.random(), 'NEG': op.neg}\n",
- " data = deque() # A queue of numbers that READ can read from\n",
- " for (_, typ, args) in stmts:\n",
- " if typ == 'DEF':\n",
- " name, parm, body = args\n",
- " functions[name] = Function(parm, body)\n",
- " elif typ == 'DATA':\n",
- " data.extend(args[0])\n",
- " return functions, data\n",
+ "builtins = { # A mapping of {name: function}\n",
+ " 'SIN': math.sin, 'COS': math.cos, 'TAN': math.tan, 'ATN': math.atan, \n",
+ " 'ABS': abs, 'EXP': math.exp, 'LOG': math.log, 'SQR': math.sqrt, 'INT': int,\n",
+ " '>': op.gt, '<': op.lt, '=': op.eq, '>=': op.ge, '<=': op.le, '<>': op.ne, \n",
+ " '^': pow, '+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv, '%': op.mod,\n",
+ " 'RND': lambda _: random.random(), 'NEG': op.neg}\n",
"\n",
- "def evalu(exp):\n",
+ "def statements_of_type(typ, statements: List[Stmt]) -> Iterable[Stmt]:\n",
+ " \"\"\"All statements of the given type.\"\"\"\n",
+ " return (stmt for stmt in statements if stmt.typ == typ)\n",
+ "\n",
+ "def append(list_of_lists) -> list:\n",
+ " \"\"\"Append together a list of lists.\"\"\"\n",
+ " return sum(list_of_lists, [])\n",
+ "\n",
+ "def evaluate(exp) -> float:\n",
" \"Evaluate an expression, returning a number.\"\n",
- " if isinstance(exp, Opcall):\n",
- " return functions[exp.op](evalu(exp.x), evalu(exp.y))\n",
- " elif isinstance(exp, Funcall):\n",
- " return functions[exp.f](evalu(exp.x))\n",
- " elif isinstance(exp, Subscript):\n",
- " return variables[exp.var, tuple(evalu(x) for x in exp.indexes)]\n",
- " elif is_varname(exp):\n",
+ " if isa(exp, Opcall):\n",
+ " return functions[exp.op](evaluate(exp.x), evaluate(exp.y))\n",
+ " elif isa(exp, Funcall):\n",
+ " return functions[exp.function](evaluate(exp.arg))\n",
+ " elif isa(exp, Subscript):\n",
+ " return variables[(exp.var, *(evaluate(x) for x in exp.indexes))]\n",
+ " elif isa(exp, Varname):\n",
" return variables[exp]\n",
- " else: # number constant\n",
+ " else: # Must be a number, which evaluates to itself\n",
" return exp\n",
- " \n",
- "def let(V, value):\n",
+ " \n",
+ "def let(V: Union[Variable, Subscript], value: float):\n",
" \"Assign value to the variable name or Subscripted variable.\"\n",
- " if isinstance(V, Subscript): # A subsscripted variable\n",
- " variables[V.var, tuple(evalu(x) for x in V.indexes)] = value \n",
+ " if isa(V, Subscript): # A subsscripted variable\n",
+ " variables[(V.var, *(evaluate(x) for x in V.indexes))] = value \n",
" else: # An unsubscripted variable\n",
" variables[V] = value"
]
@@ -1021,33 +1043,43 @@
"source": [
"# Execution: `FOR/NEXT` Statements\n",
"\n",
- "I have to admit I don't completely understand `FOR` loops. My questions include:\n",
+ "Some aspects of `FOR` loops are unclear. I have questions!\n",
"\n",
"* Are the `END` and `STEP` expressions evaluated once when we first enter the `FOR` loop, or each time through the loop?\n",
- "* After executing `\"FOR V = 1 TO 10\"`, is the value of `V` equal to 10 or 11? (Answer: the manual says 10.)\n",
- "* Does `\"FOR V = 0 TO -2\"` execute zero times? Or do all loops execute at least once, with the termination test done by the `NEXT`?\n",
- "* What if control branches into the middle of a loop and hits the `NEXT` statement, without ever executing the corresponding `FOR` statement? \n",
- "* What if control branches into the middle of a loop and hits the `NEXT` statement, without ever executing the corresponding `FOR` statement, but we have previously\n",
- "executed a `FOR` statement of a *different* loop that happens to use the same variable name?\n",
+ " - I choose: only once. (Like Python, but unlike C.)\n",
+ "* After executing `\"FOR V = 1 TO 10\"`, is the value of `V` equal to 10 or 11? \n",
+ " - Upon further review, [the manual](https://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) says 10.\n",
+ "* Does `\"FOR V = 2 TO 1\"` execute zero times? Or do all loops execute at least once, with the termination test done by the `NEXT`?\n",
+ " - All loops execute at least once. (Online BASIC interpreters agree.)\n",
+ "* What if there are two different `FOR V` loops, and a `GOTO` branches from the middle of one to the middle of the other?\n",
+ " - BASIC doesn't really have a notion of a scope for a loop, there is just a sequence of statements, all in the same scope, and all variables are global to the program. Any `NEXT V` statement always branches to the most recently executed `FOR V`, regardless of the layout of the code. (Online BASIC interpreters agree.) \n",
+ "* What if we execute a `NEXT V` statement before ever executing the `FOR V` statement? \n",
+ " - This raises an error. (I asked [Dennis Allison](https://en.wikipedia.org/wiki/Dennis_Allison), the developer of [Tiny Basic](https://en.wikipedia.org/wiki/Tiny_BASIC), and he said this case is undefined, so whatever I choose is okay.)\n",
+ " \n",
+ "Consider this program:\n",
"\n",
- "I chose a solution that is easy to implement, and correctly runs all the examples in the manual, but I'm not certain that my solution is true to the original intent. Consider this program:\n",
- "\n",
- " 10 PRINT \"TABLE OF SQUARES\"\n",
- " 20 LET N = 10\n",
- " 30 FOR V = 1 to N STEP N/5\n",
- " 40 PRINT V, V * V\n",
- " 50 NEXT V\n",
- " 60 END\n",
+ " 0 PRINT \"TABLE OF SQUARES\"\n",
+ " 10 LET N = 10\n",
+ " 20 FOR V = 1 to N STEP N/5\n",
+ " 30 PRINT V, V * V\n",
+ " 40 NEXT V\n",
+ " 50 END\n",
" \n",
" \n",
- "* When control hits the `\"FOR V\"` statement in line 30, I assign:\n",
- "
`variables['V'] = 1`\n",
- "
` fors['V'] = ForState(continu=3, end=10, step=2)`\n",
- "
where `3` is the index of line 40 (the line right after the `FOR` statement); `10` is the value of `N`; and `2` is the value of `N/5`.\n",
- "* When control hits the `\"NEXT V\"` statement in line 50, I do the following:\n",
- "
Examine `fors['V']` to check if `V` incremented by the step value, `2`, is within the bounds defined by the end, `10`. \n",
- "
If it is, increment `V` and assign `pc` to be `3`, the `continu` value. \n",
- "
If not, continue on to the following statement, `60`."
+ "When control hits the `\"FOR V\"` statement in line 20, I make these assignments in the interpreter:\n",
+ "\n",
+ " variables['V'] = 1\n",
+ " fors['V'] = ForState(continu=3, end=10, step=2)\n",
+ " \n",
+ "where `3` is the index of the line right after the `FOR` statement; `10` is the value of `N`; and `2` is the value of `N/5`. A `ForState` is an object that is used during the dynamic execution of the program (one per `FOR` loop variable), but does not occur in the static representation of the program.\n",
+ "\n",
+ "When control hits the `\"NEXT V\"` statement, do the following:\n",
+ "- Retrieve `fors['V']`\n",
+ "- Check if `V` incremented by the step value, `2`, is within the bounds defined by the end, `10`. \n",
+ "- If it is, increment `V` and assign the program counter `pc` to be `3`, the `continu` line value. \n",
+ "- If not, continue on to the following statement, line `50`.\n",
+ "\n",
+ "My implementation might not be officially correct in every detail, but it correctly runs all the examples in [the manual](https://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf). "
]
},
{
@@ -1061,30 +1093,34 @@
},
{
"cell_type": "code",
- "execution_count": 25,
+ "execution_count": 27,
"metadata": {
- "collapsed": true
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
},
"outputs": [],
"source": [
- "def basic_print(items): \n",
- " \"Print the items (',' / ';' / label / expression) in appropriate columns.\"\n",
+ "def basic_print(items) -> None: \n",
+ " \"\"\"Print the items (',' | ';' | \"label\" | expression) in appropriate columns.\"\"\"\n",
" for item in items:\n",
" if item == ',': pad(15)\n",
" elif item == ';': pad(3)\n",
" elif is_label(item): print_string(item.replace('\"', ''))\n",
- " else: print_string(\"{:g} \".format(evalu(item)))\n",
+ " else: print_string(f\"{evaluate(item):g} \")\n",
" if (not items) or items[-1] not in (',', ';'):\n",
" newline()\n",
" \n",
- "def print_string(s): \n",
+ "def print_string(s) -> None: \n",
" \"Print a string, keeping track of column, and advancing to newline if at or beyond column 100.\"\n",
" global column\n",
" print(s, end='')\n",
" column += len(s)\n",
- " if column >= 100: newline()\n",
+ " if column >= 100: \n",
+ " newline()\n",
" \n",
- "def pad(width): \n",
+ "def pad(width) -> None: \n",
" \"Pad out to the column that is the next multiple of width.\"\n",
" while column % width != 0: \n",
" print_string(' ')\n",
@@ -1103,8 +1139,13 @@
},
{
"cell_type": "code",
- "execution_count": 26,
- "metadata": {},
+ "execution_count": 28,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1116,7 +1157,7 @@
"15 READ N0, P0\n",
"20 PRINT \"N\",\n",
"25 FOR P = 2 to P0\n",
- "30 PRINT \"N ^\" P,\n",
+ "30 PRINT \"N^\" P,\n",
"35 NEXT P\n",
"40 PRINT \"SUM\"\n",
"45 LET S = 0\n",
@@ -1139,9 +1180,12 @@
},
{
"cell_type": "code",
- "execution_count": 27,
+ "execution_count": 29,
"metadata": {
- "scrolled": true
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
},
"outputs": [
{
@@ -1152,7 +1196,7 @@
" Stmt(num=15, typ='READ', args=[['N0', 'P0']]),\n",
" Stmt(num=20, typ='PRINT', args=[['\"N\"', ',']]),\n",
" Stmt(num=25, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
- " Stmt(num=30, typ='PRINT', args=[['\"N ^\"', 'P', ',']]),\n",
+ " Stmt(num=30, typ='PRINT', args=[['\"N^\"', 'P', ',']]),\n",
" Stmt(num=35, typ='NEXT', args=['P']),\n",
" Stmt(num=40, typ='PRINT', args=[['\"SUM\"']]),\n",
" Stmt(num=45, typ='LET', args=['S', 0.0]),\n",
@@ -1167,7 +1211,7 @@
" Stmt(num=99, typ='END', args=[])]"
]
},
- "execution_count": 27,
+ "execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
@@ -1178,14 +1222,14 @@
},
{
"cell_type": "code",
- "execution_count": 28,
+ "execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
- "N N ^2 N ^3 N ^4 SUM\n",
+ "N N^2 N^3 N^4 SUM\n",
"2 4 8 16 28 \n",
"3 9 27 81 145 \n",
"4 16 64 256 481 \n",
@@ -1213,8 +1257,13 @@
},
{
"cell_type": "code",
- "execution_count": 29,
- "metadata": {},
+ "execution_count": 31,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1251,8 +1300,13 @@
},
{
"cell_type": "code",
- "execution_count": 30,
- "metadata": {},
+ "execution_count": 32,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1289,8 +1343,13 @@
},
{
"cell_type": "code",
- "execution_count": 31,
- "metadata": {},
+ "execution_count": 33,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1315,8 +1374,13 @@
},
{
"cell_type": "code",
- "execution_count": 32,
- "metadata": {},
+ "execution_count": 34,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1374,28 +1438,62 @@
},
{
"cell_type": "code",
- "execution_count": 33,
+ "execution_count": 35,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "1 8 27 64 125 216 343 512 729 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 \n",
+ "8000 9261 10648 12167 13824 15625 17576 19683 21952 24389 27000 29791 32768 35937 39304 42875 46656 \n",
+ "50653 54872 59319 64000 68921 74088 79507 85184 91125 97336 103823 110592 117649 125000 132651 \n",
+ "140608 148877 157464 166375 175616 185193 195112 205379 216000 226981 238328 250047 \n",
+ "262144 274625 287496 300763 314432 328509 343000 357911 373248 389017 405224 421875 \n",
+ "438976 456533 474552 493039 512000 531441 551368 571787 592704 614125 636056 658503 \n",
+ "681472 704969 729000 753571 778688 804357 830584 857375 884736 912673 941192 970299 \n",
+ "1e+06 "
+ ]
+ }
+ ],
+ "source": [
+ "# Cubes (page 35)\n",
+ "\n",
+ "run('''\n",
+ "10 FOR I = 1 TO 100\n",
+ "20 PRINT I*I*I;\n",
+ "30 NEXT I\n",
+ "40 END\n",
+ "''')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
- "1e+06 941192 884736 830584 778688 729000 681472 636056 592704 551368 512000 474552 \n",
- "438976 405224 373248 343000 314432 287496 262144 238328 216000 195112 175616 157464 \n",
- "140608 125000 110592 97336 85184 74088 64000 54872 46656 39304 32768 27000 21952 17576 13824 10648 \n",
- "8000 5832 4096 2744 1728 1000 512 216 64 8 0 "
+ "10000 9604 9216 8836 8464 8100 7744 7396 7056 6724 6400 6084 5776 5476 5184 4900 4624 \n",
+ "4356 4096 3844 3600 3364 3136 2916 2704 2500 2304 2116 1936 1764 1600 1444 1296 1156 \n",
+ "1024 900 784 676 576 484 400 324 256 196 144 100 64 36 16 4 0 "
]
}
],
"source": [
- "# Cubes (page 35; but with STEP -2 because I haven't tested negative step yet)\n",
+ "# Squares (variant of page 35 program, designed to test negative STEP)\n",
"\n",
"run('''\n",
"10 FOR I = 100 TO 0 STEP -2\n",
- "20 PRINT I*I*I;\n",
+ "20 PRINT I*I;\n",
"30 NEXT I\n",
- "40 END\n",
"''')\n",
"\n",
"assert variables['I'] == 0"
@@ -1403,8 +1501,13 @@
},
{
"cell_type": "code",
- "execution_count": 34,
- "metadata": {},
+ "execution_count": 37,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1455,37 +1558,42 @@
},
{
"cell_type": "code",
- "execution_count": 35,
- "metadata": {},
+ "execution_count": 38,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(float,\n",
- " {'J': 5.0,\n",
- " ('S', (3.0, 1.0)): 35.0,\n",
- " ('S', (3.0, 4.0)): 16.0,\n",
- " ('S', (3.0, 5.0)): 33.0,\n",
- " ('S', (1.0, 2.0)): 20.0,\n",
- " ('S', (1.0, 3.0)): 37.0,\n",
- " ('S', (2.0, 3.0)): 3.0,\n",
- " ('S', (2.0, 2.0)): 16.0,\n",
- " ('S', (1.0, 5.0)): 42.0,\n",
- " ('P', (1.0,)): 1.25,\n",
- " ('S', (3.0, 3.0)): 29.0,\n",
- " ('S', (2.0, 4.0)): 21.0,\n",
- " 'S': 169.4,\n",
- " 'I': 3.0,\n",
- " ('P', (2.0,)): 4.3,\n",
- " ('S', (3.0, 2.0)): 47.0,\n",
- " ('S', (1.0, 1.0)): 40.0,\n",
- " ('S', (1.0, 4.0)): 29.0,\n",
- " ('S', (2.0, 5.0)): 8.0,\n",
- " ('P', (3.0,)): 2.5,\n",
- " ('S', (2.0, 1.0)): 10.0})"
+ " {'I': 3.0,\n",
+ " ('P', 1.0): 1.25,\n",
+ " ('P', 2.0): 4.3,\n",
+ " ('P', 3.0): 2.5,\n",
+ " 'J': 5.0,\n",
+ " ('S', 1.0, 1.0): 40.0,\n",
+ " ('S', 1.0, 2.0): 20.0,\n",
+ " ('S', 1.0, 3.0): 37.0,\n",
+ " ('S', 1.0, 4.0): 29.0,\n",
+ " ('S', 1.0, 5.0): 42.0,\n",
+ " ('S', 2.0, 1.0): 10.0,\n",
+ " ('S', 2.0, 2.0): 16.0,\n",
+ " ('S', 2.0, 3.0): 3.0,\n",
+ " ('S', 2.0, 4.0): 21.0,\n",
+ " ('S', 2.0, 5.0): 8.0,\n",
+ " ('S', 3.0, 1.0): 35.0,\n",
+ " ('S', 3.0, 2.0): 47.0,\n",
+ " ('S', 3.0, 3.0): 29.0,\n",
+ " ('S', 3.0, 4.0): 16.0,\n",
+ " ('S', 3.0, 5.0): 33.0,\n",
+ " 'S': 169.4})"
]
},
- "execution_count": 35,
+ "execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
@@ -1496,16 +1604,21 @@
},
{
"cell_type": "code",
- "execution_count": 36,
- "metadata": {},
+ "execution_count": 39,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
- "1 2 4 9 5 0 5 3 7 7 3 8 6 4 4 6 7 4 5 4 8 8 7 9 4 1 0 3 5 2 3 4 5 3 \n",
- "6 5 3 1 0 9 5 6 1 4 5 7 3 1 4 3 6 3 7 2 3 0 2 2 7 5 0 8 7 9 3 9 5 7 \n",
- "5 0 1 9 6 3 7 5 0 0 5 7 3 5 9 3 2 6 1 2 1 9 1 7 0 9 0 6 9 6 7 2 "
+ "8 5 6 5 2 4 8 4 6 0 8 3 5 8 8 4 7 7 0 3 3 7 9 0 5 5 2 8 5 7 2 3 9 1 \n",
+ "9 3 1 7 7 6 1 4 7 7 5 7 0 9 9 3 0 8 0 4 2 4 9 4 1 7 8 8 6 9 0 7 7 5 \n",
+ "4 7 7 6 0 7 0 9 8 0 8 7 3 5 9 7 0 1 3 2 5 5 8 7 0 0 3 0 2 1 8 8 "
]
}
],
@@ -1522,8 +1635,13 @@
},
{
"cell_type": "code",
- "execution_count": 37,
- "metadata": {},
+ "execution_count": 40,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1536,7 +1654,7 @@
"45 0.707107 0.707107 1 \n",
"60 0.866025 0.5 1 \n",
"75 0.965926 0.258819 1 \n",
- "90 1 6.12323e-17 1 \n"
+ "90 1 1.61554e-15 1 \n"
]
}
],
@@ -1545,20 +1663,25 @@
"\n",
"run('''\n",
" 5 PRINT \"D\"; \"SIN(D)\", \"COS(D)\", \"SIN(D)^2 + COS(D)^2\"\n",
- "20 LET P = 3.1415926535897932 / 180\n",
+ "20 LET P1 = 3.14159265358979 / 180\n",
"30 FOR X = 0 TO 90 STEP 15\n",
"40 PRINT X; FNS(X), FNC(X), FNS(X)^2 + FNC(X)^2\n",
"50 NEXT X\n",
- "97 DEF FNS(D) = SIN(D * P)\n",
- "98 DEF FNC(D) = COS(D * P)\n",
+ "97 DEF FNS(D) = SIN(D * P1)\n",
+ "98 DEF FNC(D) = COS(D * P1)\n",
"99 END\n",
"''')"
]
},
{
"cell_type": "code",
- "execution_count": 38,
- "metadata": {},
+ "execution_count": 41,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1571,7 +1694,7 @@
}
],
"source": [
- "# GOSUB (page 43)\n",
+ "# GOSUB example (page 43)\n",
"\n",
"run('''\n",
"100 LET X = 3\n",
@@ -1593,14 +1716,19 @@
},
{
"cell_type": "code",
- "execution_count": 39,
- "metadata": {},
+ "execution_count": 42,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
- "21 \n"
+ "SUM = 21 \n"
]
}
],
@@ -1614,21 +1742,26 @@
"30 IF N >= 20 THEN 60\n",
"40 LET N = N + 1\n",
"50 GOTO 20\n",
- "60 PRINT S\n",
+ "60 PRINT \"SUM = \" S\n",
"70 END\n",
"''')"
]
},
{
"cell_type": "code",
- "execution_count": 40,
- "metadata": {},
+ "execution_count": 43,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
- "21 \n"
+ "SUM = 21 \n"
]
}
],
@@ -1637,7 +1770,7 @@
"20 FOR N = 1 TO 20\n",
"40 LET S = S + N/10\n",
"50 NEXT N\n",
- "60 PRINT S\n",
+ "60 PRINT \"SUM = \" S\n",
"70 END\n",
"''')\n",
"\n",
@@ -1655,8 +1788,13 @@
},
{
"cell_type": "code",
- "execution_count": 41,
- "metadata": {},
+ "execution_count": 44,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1671,19 +1809,19 @@
"Error in line '7 LET Z = +3' at '+ 3': unknown expression\n",
"Error in line '8 LET X = Y ** 2' at '* 2': unknown expression\n",
"Error in line '9 LET A(I = 1' at '= 1': expected \")\" to close subscript\n",
- "Error in line '10 IF A = 0 THEN 900 + 99' at '+ 99': extra tokens at end of line\n",
- "Error in line '11 NEXT A(I)' at '( I )': extra tokens at end of line\n",
+ "Error in line '10 IF A = 0 THEN 900 + 99' at '+ 99': extra TOKENS at end of line\n",
+ "Error in line '11 NEXT A(I)' at '( I )': extra TOKENS at end of line\n",
"Error in line '12 DEF F(X) = X ^ 2 + 1' at 'F ( X ) = X ^ 2 + 1': expected a function name\n",
"Error in line '13 IF X != 0 THEN 999' at '! = 0 THEN 999': expected a relational operator\n",
"Error in line '14 DEF FNS(X + 2*P1) = SIN(X)' at '+ 2 * P1 ) = SIN ( X )': expected ')'\n",
"Error in line '15 DEF FNY(M, B) = M * X + B' at ', B ) = M * X + B': expected ')'\n",
"Error in line '16 LET 3 = X' at '3 = X': expected a variable name\n",
"Error in line '17 LET SIN = 7 * DEADLY' at 'SIN = 7 * D E A D L Y': expected a variable name\n",
- "Error in line '18 LET X = A-1(I)' at '( I )': extra tokens at end of line\n",
+ "Error in line '18 LET X = A-1(I)' at '( I )': extra TOKENS at end of line\n",
"Error in line '19 FOR SCORE + 7' at 'C O R E + 7': expected '='\n",
- "Error in line '20 STOP IN NAME(LOVE)' at 'I N N A M E ( L O V E )': extra tokens at end of line\n",
- "Error in line '85 ENDURANCE.' at 'U R A N C E .': extra tokens at end of line\n",
- "ADD 2 + 2 = 4 \n"
+ "Error in line '20 STOP IN NAME(LOVE)' at 'I N N A M E ( L O V E )': extra TOKENS at end of line\n",
+ "Error in line '26 ENDURANCE.' at 'U R A N C E .': extra TOKENS at end of line\n",
+ "NO ERROR IN LINE 27 DESPITE THE ERRORS ABOVE\n"
]
}
],
@@ -1709,14 +1847,13 @@
"18 LET X = A-1(I)\n",
"19 FOR SCORE + 7\n",
"20 STOP IN NAME(LOVE)\n",
- "80 REMARKABLY, THE INTERPRETER\n",
- "81 REMEDIES THE ERRORS, AND THE PROPGRAM\n",
- "82 REMAINS AN EXECUTABLE ENTITY, UN-\n",
- "83 REMITTENTLY RUNNING, WITH NO\n",
- "84 REMORSE OR REGRETS, AND WITH GREAT\n",
- "85 ENDURANCE.\n",
- "98 PRINT \"ADD 2 + 2 = \" 2 + 2\n",
- "99 END\n",
+ "21 REMARKABLY, THE INTERPRETER\n",
+ "22 REMEDIES THE ERRORS, AND THE PROGRAM\n",
+ "23 REMAINS AN EXECUTABLE ENTITY, UN-\n",
+ "24 REMITTENTLY RUNNING, WITH NO\n",
+ "25 REMORSE OR REGRETS, AND WITH GREAT\n",
+ "26 ENDURANCE.\n",
+ "27 PRINT \"No error in line \" 3 ^ 3 \"despite the errors above\"\n",
"''')"
]
},
@@ -1731,8 +1868,13 @@
},
{
"cell_type": "code",
- "execution_count": 42,
- "metadata": {},
+ "execution_count": 45,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -1925,13 +2067,18 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Actually, this was an assignment in my high school BASIC class. (We used a [slightly different](https://www.grc.com/pdp-8/docs/os8_basic_reference.pdf) version of BASIC.) Back then, output was on rolls of paper, and I thought it was wasteful to print only one generation per line. So I arranged to print multiple generations on the same line, storing them until it was time to print them out. But BASIC doesn't have three-dimensional arrays, so I needed to store several generations worth of data in one `A(X, Y)` value. Today, I know that that could be done by allocating one bit for each generation, but back then I don't think I knew about binary representation, so I stored one generation in each decimal digit. That means I no longer need two matrixes, `A` and `B`; instead, the current generation will always be the value in the one's place, the previous generation in the ten's place, and the one before that in the hundred's place. (Also, I admit I cheated: I added the mod operatoir, `%`, which did not appear in early versions of BASIC, just because it was useful for this program.)"
+ "Actually, this was an assignment in my high school BASIC class. (We used a [slightly different](https://www.grc.com/pdp-8/docs/os8_basic_reference.pdf) version of BASIC.) Back then, output was on [rolls of paper](https://en.wikipedia.org/wiki/Teletype_Corporation#/media/File:Teletype_with_papertape_punch_and_reader.jpg), and I thought it was wasteful to print only one generation per line. So I arranged to print multiple generations on the same line, storing them until it was time to print them out. But BASIC doesn't have three-dimensional arrays, so I needed to store several generations' worth of data in one `A(X, Y)` value. Today, I know that this could be done by allocating one bit for each generation, but back then I don't think I knew about binary representation, so I stored one generation in each decimal digit. That means I no longer need two matrices, `A` and `B`; instead, the current generation will always be the value in the ones place, the previous generation in the tens place, and the one before that in the hundreds place."
]
},
{
"cell_type": "code",
- "execution_count": 43,
- "metadata": {},
+ "execution_count": 46,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [
{
"name": "stdout",
@@ -2065,7 +2212,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -2079,9 +2226,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.5.3"
+ "version": "3.8.15"
}
},
"nbformat": 4,
- "nbformat_minor": 1
+ "nbformat_minor": 4
}