From 171486979aa4facf7692d29a8576ae8fbf26b884 Mon Sep 17 00:00:00 2001
From: Peter Norvig
Date: Thu, 7 Mar 2019 16:04:51 -0800
Subject: [PATCH] Add files via upload

---
 ipynb/Cryptarithmetic.ipynb | 112 +++++++++++++++++++++++-------------
 1 file changed, 73 insertions(+), 39 deletions(-)

diff --git a/ipynb/Cryptarithmetic.ipynb b/ipynb/Cryptarithmetic.ipynb
index 7f3b14f..d3b1632 100644
--- a/ipynb/Cryptarithmetic.ipynb
+++ b/ipynb/Cryptarithmetic.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Peter Norvig 2014\n",
+    "Peter Norvig 2014\n",
     "\n",
     "# Cryptarithmetic (Alphametic) Problems\n",
     "\n",
@@ -25,10 +25,11 @@
     "6. Should I find one solution (faster) or all solutions (more complete)? I'll handle both use cases by \n",
     "implementing my function `solve` to return an iterator, which yields solutions one at a time; you can get the first one with `next` or all of them with `set`. \n",
     "\n",
-    "## The solution: `solve`\n",
+    "# The solution: `solve`\n",
     "\n",
     "Below we see that `solve` works by generating every way to replace letters in the formula with numbers,\n",
-    "and then filtering them to keep only valid strings (ones that evaluate to true and have no leading zero)."
+    "and then filtering them to keep only valid strings (ones that evaluate to true and have no leading zero).\n",
+    "The `str.translate` method is used to do the replacements."
    ]
   },
   {
@@ -43,22 +44,25 @@
     "def solve(formula):\n",
     "    \"\"\"Given a formula like 'NUM + BER = PLAY', fill in digits to solve it.\n",
     "    Generate all valid digit-filled-in strings.\"\"\"\n",
-    "    return filter(valid, replace_letters(formula.replace(' = ', ' == ')))\n",
+    "    return filter(valid, letter_replacements(formula))\n",
     "\n",
-    "def replace_letters(formula):\n",
-    "    \"\"\"Generate all possible replacements of letters with digits in formula.\"\"\"\n",
-    "    letters = ''.join(set(re.findall('[A-Z]', formula)))\n",
+    "def letter_replacements(formula):\n",
+    "    \"\"\"All possible replacements of letters with digits in formula.\"\"\"\n",
+    "    formula = formula.replace(' = ', ' == ') # Allow = or ==\n",
+    "    letters = cat(set(re.findall('[A-Z]', formula)))\n",
     "    for digits in itertools.permutations('1234567890', len(letters)):\n",
-    "        yield formula.translate(str.maketrans(letters, ''.join(digits)))\n",
+    "        yield formula.translate(str.maketrans(letters, cat(digits)))\n",
     "\n",
     "def valid(exp):\n",
     "    \"\"\"Expression is valid iff it has no leading zero, and evaluates to true.\"\"\"\n",
     "    try:\n",
-    "        return not leading_zero(exp) and eval(exp)\n",
+    "        return not leading_zero(exp) and eval(exp) is True\n",
     "    except ArithmeticError:\n",
     "        return False\n",
     " \n",
-    "leading_zero = re.compile(r'\\b0[0-9]').search"
+    "cat = ''.join # Function to concatenate strings\n",
+    " \n",
+    "leading_zero = re.compile(r'\\b0[0-9]').search # Function to check for illegal number"
    ]
   },
   {
@@ -69,7 +73,7 @@
     {
      "data": {
       "text/plain": [
-       "'746 + 289 == 1035'"
+       "'489 + 537 == 1026'"
       ]
      },
      "execution_count": 2,
@@ -90,8 +94,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 17 s, sys: 68.2 ms, total: 17.1 s\n",
-      "Wall time: 17.2 s\n"
+      "CPU times: user 17.2 s, sys: 61.3 ms, total: 17.3 s\n",
+      "Wall time: 17.4 s\n"
      ]
     },
     {
@@ -113,29 +117,52 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "So there are 96 solutions, but `solve` is a bit slow to find them all. How could we make `solve` faster?\n",
+    "# A faster solution: `faster_solve`\n",
     "\n",
-    "## A faster solution: `faster_solve`\n",
-    "\n",
-    "I used `%prun` to get profiling results, and saw that 2/3 of the time is spent in `eval`. So let's eliminate the calls to `eval`. That should be doable, because the expression we are evaluating is basically the same each time, but with different permutations of digits filled in. We could save a lot of work if we convert the expression into a Python function, compile that function just once, and then call the function for each permutation of digits. In other words, we want to take an expression such as\n",
+    "Depending on your computer, that probably took 15 or 20 seconds. Can we make it faster? To answer the question, I start by profiling to see where the time is spent. I can use the magic function `%prun` to profile:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " "
+     ]
+    }
+   ],
+   "source": [
+    "%prun next(solve('NUM + BER = PLAY'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We see that about 2/3 of the time is spent in `eval`. So let's eliminate the calls to `eval`. That should be doable, because the expression we are evaluating is basically the same each time, but with different permutations of digits filled in. We could save a lot of work if we convert the expression into a Python function, compile that function just once, and then call the function for each of the 3.6 million permutations of digits. We want to take an expression such as:\n",
     "\n",
     "    \"NUM + BER == PLAY\"\n",
     " \n",
-    "and transform it into the Python function\n",
+    "and transform it into the Python function:\n",
     "\n",
-    "\n",
-    "    lambda A,B,E,L,M,N,P,R,U,Y: (100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y)\n",
+    "    (lambda A,B,E,L,M,N,P,R,U,Y: \n",
+    "        (100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y))\n",
     " \n",
     "Actually that's not quite right. The rules say that \"N\", \"B\", and \"P\" cannot be zero. So the function should be:\n",
     "\n",
-    "    A,B,E,L,M,N,P,R,U,Y: B and N and P and ((100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y))\n",
+    "    (lambda A,B,E,L,M,N,P,R,U,Y: \n",
+    "        B and N and P and ((100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y)))\n",
     "\n",
     "Here is the code to compile a formula into a Python function: "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -144,7 +171,7 @@
     "    in same order as parms of function. For example, 'YOU == ME**2' returns\n",
     "    (lambda E,M,O,U,Y: M and Y and ((100*Y+10*O+U) == (10*M+E)**2), 'YMEUO'\"\"\"\n",
     "    formula = formula.replace(' = ', ' == ')\n",
-    "    letters = ''.join(sorted(set(re.findall('[A-Z]', formula))))\n",
+    "    letters = cat(sorted(set(re.findall('[A-Z]', formula))))\n",
     "    firstletters = sorted(set(re.findall(r'\\b([A-Z])[A-Z]', formula)))\n",
     "    body = re.sub('[A-Z]+', compile_word, formula)\n",
     "    body = ' and '.join(firstletters + [body])\n",
@@ -164,7 +191,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -180,7 +207,7 @@
        "((E, M, O, U, Y)>, 'EMOUY')"
       ]
      },
-     "execution_count": 5,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -191,7 +218,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -207,7 +234,7 @@
        "((A, B, E, L, M, N, P, R, U, Y)>, 'ABELMNPRUY')"
       ]
      },
-     "execution_count": 6,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -216,9 +243,16 @@
     "compile_formula(\"NUM + BER = PLAY\", verbose=True)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we're ready for the faster version of `solve`:"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -229,14 +263,14 @@
     "    for digits in itertools.permutations((1,2,3,4,5,6,7,8,9,0), len(letters)):\n",
     "        try:\n",
     "            if fn(*digits):\n",
-    "                yield formula.translate(str.maketrans(letters, ''.join(map(str, digits))))\n",
+    "                yield formula.translate(str.maketrans(letters, cat(map(str, digits))))\n",
     "        except ArithmeticError: \n",
     "            pass"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [
     {
@@ -245,7 +279,7 @@
        "'587 + 439 = 1026'"
       ]
      },
-     "execution_count": 8,
+     "execution_count": 9,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -256,15 +290,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 10,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 1.3 s, sys: 3.96 ms, total: 1.3 s\n",
-      "Wall time: 1.3 s\n"
+      "CPU times: user 1.32 s, sys: 8.41 ms, total: 1.33 s\n",
+      "Wall time: 1.33 s\n"
      ]
     },
     {
@@ -273,7 +307,7 @@
        "96"
       ]
      },
-     "execution_count": 9,
+     "execution_count": 10,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -298,7 +332,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [
     {
@@ -372,8 +406,8 @@
       "IN + ARCTIC + TERRAIN + AN + ANCIENT + EERIE + ICE + TRACT + I + ENTER + A + TRANCE = FLATIANA| 42 + 379549 + 5877342 + 32 + 3294825 + 88748 + 498 + 57395 + 4 + 82587 + 3 + 573298 = 10354323\n",
       "ONE < TWO < THREE < SEVEN - THREE < THREE + TWO < THREE + THREE < SEVEN < SEVEN + ONE < THREE * THREE| 321 < 483 < 45711 < 91612 - 45711 < 45711 + 483 < 45711 + 45711 < 91612 < 91612 + 321 < 45711 * 45711\n",
       "AN + ACCELERATING + INFERENTIAL + ENGINEERING + TALE + ELITE + GRANT + FEE + ET + CETERA = ARTIFICIAL + INTELLIGENCE| 59 + 577404251698 + 69342491650 + 49869442698 + 1504 + 40614 + 82591 + 344 + 41 + 741425 = 5216367650 + 691400684974\n",
-      "CPU times: user 49.7 s, sys: 216 ms, total: 50 s\n",
-      "Wall time: 50.2 s\n"
+      "CPU times: user 51.2 s, sys: 246 ms, total: 51.4 s\n",
+      "Wall time: 51.7 s\n"
      ]
     },
     {
@@ -382,7 +416,7 @@
        "67"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 11,
      "metadata": {},
      "output_type": "execute_result"
     }