{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
Peter Norvig 2014
\n", "\n", "# Cryptarithmetic (Alphametic) Problems\n", "\n", "On April 28, Gary Antonick had another [Numberplay column](http://wordplay.blogs.nytimes.com/2014/04/28/num/) that quotes my friend Bill Gosper. (Gosper often presents more advanced puzzles in the [math-fun](http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun) mailing list.) This puzzle was:\n", "\n", "> For the expression `N U M + B E R = P L A Y`,\n", "> - Which distinct numerals (each different) can be substituted for letters to make a valid expression?\n", "> - How many solutions are there?\n", "\n", "I [tackled](https://www.udacity.com/wiki/cs212/unit-2#rethinking-eval) this type of problem (known as a [cryptarithmetic](http://mathworld.wolfram.com/Cryptarithmetic.html) or [alphametic](http://mathworld.wolfram.com/Alphametic.html) problem) in my Udacity class [CS 212](https://www.udacity.com/wiki/cs212/unit-2#cryptarithmetic). \n", "\n", "My initial approach was simple: [when in doubt, use brute force](https://www.brainyquote.com/quotes/ken_thompson_185574). I try all permutations of digits replacing letters (that should be quick and easy: there can be at most 10 factorial, or about 3.6 million, permutations), then for each one, I use Python's `eval` function to see if the resulting string is a valid expression. The basic idea is simple, but there are a few complications to worry about:\n", "\n", "1. Math uses `=` and Python uses `==` for equality; we'll allow either one in formulas.\n", "2. A number with a leading zero (like `012`) is illegal (but `0` by itself is ok).\n", "3. If we try to `eval` an expression like `1/0`, we'll have to handle the divide-by-zero error.\n", "4. Only uppercase letters are replaced by digits. So in `2 * BE or not 2`, the `or` and `not` are Python keywords.\n", "5. Should I find one solution (faster) or all solutions (more complete)? 
I'll handle both use cases by \n", "implementing my function `solve` to return an iterator, which yields solutions one at a time; you can get the first one with `next` or all of them with `set`. \n", "\n", "# The solution: `solve`\n", "\n", "Below we see that `solve` works by generating every way to replace letters in the formula with numbers,\n", "and then filtering them to keep only valid strings (ones that evaluate to true and have no leading zero).\n", "The `str.translate` method is used to do the replacements." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import itertools\n", "import re\n", "\n", "def solve(formula):\n", "    \"\"\"Given a formula like 'NUM + BER = PLAY', fill in digits to solve it.\n", "    Generate all valid digit-filled-in strings.\"\"\"\n", "    return filter(valid, letter_replacements(formula))\n", "\n", "def letter_replacements(formula):\n", "    \"\"\"All possible replacements of letters with digits in formula.\"\"\"\n", "    formula = formula.replace(' = ', ' == ') # Allow = or ==\n", "    letters = cat(set(re.findall('[A-Z]', formula)))\n", "    for digits in itertools.permutations('1234567890', len(letters)):\n", "        yield formula.translate(str.maketrans(letters, cat(digits)))\n", "\n", "def valid(exp):\n", "    \"\"\"Expression is valid iff it has no leading zero, and evaluates to true.\"\"\"\n", "    try:\n", "        return not leading_zero(exp) and eval(exp) is True\n", "    except ArithmeticError:\n", "        return False\n", "\n", "cat = ''.join # Function to concatenate strings\n", "\n", "leading_zero = re.compile(r'\\b0[0-9]').search # Function to check for illegal number" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'489 + 537 == 1026'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(solve('NUM + BER = PLAY'))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "CPU times: user 17.2 s, sys: 61.3 ms, total: 17.3 s\n", "Wall time: 17.4 s\n" ] }, { "data": { "text/plain": [ "96" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(set(solve('NUM + BER = PLAY')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# A faster solution: `faster_solve`\n", "\n", "Depending on your computer, that probably took 15 or 20 seconds. Can we make it faster? To answer the question, I start by profiling to see where the time is spent. I can use the magic function `%prun` to profile:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " " ] } ], "source": [ "%prun next(solve('NUM + BER = PLAY'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that about 2/3 of the time is spent in `eval`. So let's eliminate the calls to `eval`. That should be doable, because the expression we are evaluating is basically the same each time, but with different permutations of digits filled in. We could save a lot of work if we convert the expression into a Python function, compile that function just once, and then call the function for each of the 3.6 million permutations of digits. We want to take an expression such as:\n", "\n", "    \"NUM + BER == PLAY\"\n", "\n", "and transform it into the Python function:\n", "\n", "    (lambda A,B,E,L,M,N,P,R,U,Y: \n", "        (100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y))\n", "\n", "Actually that's not quite right. The rules say that \"N\", \"B\", and \"P\" cannot be zero. 
So the function should be:\n", "\n", "    (lambda A,B,E,L,M,N,P,R,U,Y: \n", "        B and N and P and ((100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y)))\n", "\n", "Here is the code to compile a formula into a Python function: " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def compile_formula(formula, verbose=False):\n", "    \"\"\"Compile formula into a function. Also return letters found, as a str,\n", "    in same order as parms of function. For example, 'YOU == ME**2' returns\n", "    (lambda E,M,O,U,Y: M and Y and (100*Y+10*O+U) == (10*M+E)**2), 'EMOUY'\"\"\"\n", "    formula = formula.replace(' = ', ' == ')\n", "    letters = cat(sorted(set(re.findall('[A-Z]', formula))))\n", "    firstletters = sorted(set(re.findall(r'\\b([A-Z])[A-Z]', formula)))\n", "    body = re.sub('[A-Z]+', compile_word, formula)\n", "    body = ' and '.join(firstletters + [body])\n", "    fn = 'lambda {}: {}'.format(','.join(letters), body)\n", "    if verbose: print(fn)\n", "    assert len(letters) <= 10\n", "    return eval(fn), letters\n", "\n", "def compile_word(matchobj):\n", "    \"Compile the word 'YOU' as '(100*Y+10*O+U)'.\"\n", "    word = matchobj.group()\n", "    terms = reversed([mult(10**i, L) for (i, L) in enumerate(reversed(word))])\n", "    return '(' + '+'.join(terms) + ')'\n", "\n", "def mult(factor, var): return var if factor == 1 else str(factor) + '*' + var" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "lambda E,M,O,U,Y: M and Y and (100*Y+10*O+U) == (10*M+E)**2\n" ] }, { "data": { "text/plain": [ "(<function __main__.<lambda>(E, M, O, U, Y)>, 'EMOUY')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compile_formula(\"YOU == ME**2\", verbose=True)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "lambda A,B,E,L,M,N,P,R,U,Y: B and N and P and (100*N+10*U+M) + 
(100*B+10*E+R) == (1000*P+100*L+10*A+Y)\n" ] }, { "data": { "text/plain": [ "(<function __main__.<lambda>(A, B, E, L, M, N, P, R, U, Y)>, 'ABELMNPRUY')" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compile_formula(\"NUM + BER = PLAY\", verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we're ready for the faster version of `solve`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def faster_solve(formula):\n", "    \"\"\"Given a formula like 'NUM + BER == PLAY', fill in digits to solve it.\n", "    This version precompiles the formula and generates all digit-filled-in strings.\"\"\"\n", "    fn, letters = compile_formula(formula)\n", "    for digits in itertools.permutations((1,2,3,4,5,6,7,8,9,0), len(letters)):\n", "        try:\n", "            if fn(*digits):\n", "                yield formula.translate(str.maketrans(letters, cat(map(str, digits))))\n", "        except ArithmeticError:\n", "            pass" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'587 + 439 = 1026'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(faster_solve('NUM + BER = PLAY'))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.32 s, sys: 8.41 ms, total: 1.33 s\n", "Wall time: 1.33 s\n" ] }, { "data": { "text/plain": [ "96" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(list(faster_solve('NUM + BER = PLAY')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We get the same answer, but `faster_solve` is about 13 times faster than `solve` (1.33 seconds versus 17.4 seconds)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# More Examples" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "NUM + BER = PLAY | 587 + 439 = 1026\n", "YOU = ME ** 2 | 576 = 24 ** 2\n", "SEND + MORE = MONEY | 9567 + 1085 = 10652\n", "PI * R**2 = AREA | 96 * 7**2 = 4704\n", "WRONG + WRONG = RIGHT | 37081 + 37081 = 74162\n", "WRIGHT + WRIGHT = TO * FLY + FLIGHT | 413058 + 413058 = 82 * 769 + 763058\n", "TWO + TWENTY = TWELVE + TEN | 784 + 781279 = 781351 + 712\n", "A**2 + B**2 = C**2 | 3**2 + 4**2 = 5**2\n", "AYH**2 + BEE**2 = SEE**2 | 760**2 + 522**2 = 922**2\n", "RAMN = R**3 + RM**3 = N**3 + RX**3 | 1729 = 1**3 + 12**3 = 9**3 + 10**3\n", "MON-EY = EVIL**(1/2) | 108-42 = 4356**(1/2)\n", "ALPHABET + LETTERS = SCRABBLE | 17531908 + 7088062 = 24619970\n", "SIN**2 + COS**2 = UNITY | 235**2 + 142**2 = 75389\n", "POTATO + TOMATO = PUMPKIN | 168486 + 863486 = 1031972\n", "ODD * ODD = FREAKY | 677 * 677 = 458329\n", "DOUBLE + DOUBLE + TOIL = TROUBLE | 798064 + 798064 + 1936 = 1598064\n", "WASH + YOUR = HANDS | 5291 + 6748 = 12039\n", "SPEED + LIMIT = FIFTY | 40221 + 36568 = 76789\n", "TERRIBLE + NUMBER = THIRTEEN | 45881795 + 302758 = 46184553\n", "SEVEN + SEVEN + SIX = TWENTY | 68782 + 68782 + 650 = 138214\n", "EIGHT + EIGHT + TWO + ONE + ONE = TWENTY | 52371 + 52371 + 104 + 485 + 485 = 105816\n", "ELEVEN + NINE + FIVE + FIVE = THIRTY | 797275 + 5057 + 4027 + 4027 = 810386\n", "NINE + SEVEN + SEVEN + SEVEN = THIRTY | 3239 + 49793 + 49793 + 49793 = 152618\n", "FOURTEEN + TEN + TEN + SEVEN = FORTYONE | 19564882 + 482 + 482 + 78082 = 19643928\n", "HAWAII + IDAHO + IOWA + OHIO = STATES | 510199 + 98153 + 9301 + 3593 = 621246\n", "VIOLIN * 2 + VIOLA = TRIO + SONATA | 176478 * 2 + 17645 = 2076 + 368525\n", "SEND + A + TAD + MORE = MONEY | 9283 + 7 + 473 + 1062 = 10825\n", "ZEROES + ONES = BINARY | 698392 + 3192 = 701584\n", "DCLIZ + DLXVI = MCCXXV | 62049 + 60834 = 122883\n", 
"COUPLE + COUPLE = QUARTET | 653924 + 653924 = 1307848\n", "FISH + N + CHIPS = SUPPER | 5718 + 3 + 98741 = 104462\n", "SATURN + URANUS + NEPTUNE + PLUTO = PLANETS | 127503 + 502351 + 3947539 + 46578 = 4623971\n", "PLUTO not in {PLANETS} | 63985 not in {6314287}\n", "EARTH + AIR + FIRE + WATER = NATURE | 67432 + 704 + 8046 + 97364 = 173546\n", "TWO * TWO = SQUARE | 807 * 807 = 651249\n", "HIP * HIP = HURRAY | 958 * 958 = 917764\n", "NORTH / SOUTH = EAST / WEST | 51304 / 61904 = 7260 / 8760\n", "NAUGHT ** 2 = ZERO ** 3 | 328509 ** 2 = 4761 ** 3\n", "I + THINK + IT + BE + THINE = INDEED | 1 + 64125 + 16 + 73 + 64123 = 128338\n", "DO + YOU + FEEL = LUCKY | 57 + 870 + 9441 = 10368\n", "WELL - DO + YOU = PUNK | 5277 - 13 + 830 = 6094\n", "NOW + WE + KNOW + THE = TRUTH | 673 + 38 + 9673 + 128 = 10512\n", "SORRY + TO + BE + A + PARTY = POOPER | 80556 + 20 + 34 + 9 + 19526 = 100145\n", "SORRY + TO + BUST + YOUR = BUBBLE | 94665 + 24 + 1092 + 5406 = 101187\n", "STEEL + BELTED = RADIALS | 87336 + 936732 = 1024068\n", "ABRA + CADABRA + ABRA + CADABRA = HOUDINI | 7457 + 1797457 + 7457 + 1797457 = 3609828\n", "I + GUESS + THE + TRUTH = HURTS | 5 + 26811 + 478 + 49647 = 76941\n", "LETS + CUT + TO + THE = CHASE | 9478 + 127 + 75 + 704 = 10384\n", "THATS + THE + THEORY = ANYWAY | 86987 + 863 + 863241 = 951091\n", "TWO + TWO = FOUR | 734 + 734 = 1468\n", "X / X = X | 1 / 1 = 1\n", "A**N + B**N = C**N and N > 1 | 3**2 + 4**2 = 5**2 and 2 > 1\n", "ATOM**0.5 = A + TO + M | 1296**0.5 = 1 + 29 + 6\n", "GLITTERS is not GOLD | 35499278 is not 3651\n", "ONE < TWO and FOUR < FIVE | 351 < 703 and 2386 < 2491\n", "ONE < TWO < THREE < TWO * TWO | 431 < 674 < 62511 < 674 * 674\n", "(2 * BE or not 2 * BE) = THE + QUES-TION | (2 * 13 or not 2 * 13) = 843 + 7239-8056\n", "sum(range(AA)) = BB | sum(range(11)) = 55\n", "sum(range(POP)) = BOBO | sum(range(101)) = 5050\n", "ODD + ODD = EVEN | 655 + 655 = 1310\n", "ROMANS+ALSO+MORE+OR+LESS+ADDED = LETTERS | 975348+3187+5790+79+1088+36606 = 1022098\n", 
"FOUR+ONE==FIVE and ONE+ONE+ONE+ONE+ONE==FIVE | 1380+345==1725 and 345+345+345+345+345==1725\n", "TEN+SEVEN+SEVEN+SEVEN+FOUR+FOUR+ONE = FORTY | 520+12820+12820+12820+4937+4937+902 = 49756\n", "NINETEEN+THIRTEEN+THREE+2*TWO+3*ONE = FORTYTWO| 42415114+56275114+56711+2*538+3*841 = 98750538\n", "IN + ARCTIC + TERRAIN + AN + ANCIENT + EERIE + ICE + TRACT + I + ENTER + A + TRANCE = FLATIANA| 42 + 379549 + 5877342 + 32 + 3294825 + 88748 + 498 + 57395 + 4 + 82587 + 3 + 573298 = 10354323\n", "ONE < TWO < THREE < SEVEN - THREE < THREE + TWO < THREE + THREE < SEVEN < SEVEN + ONE < THREE * THREE| 321 < 483 < 45711 < 91612 - 45711 < 45711 + 483 < 45711 + 45711 < 91612 < 91612 + 321 < 45711 * 45711\n", "AN + ACCELERATING + INFERENTIAL + ENGINEERING + TALE + ELITE + GRANT + FEE + ET + CETERA = ARTIFICIAL + INTELLIGENCE| 59 + 577404251698 + 69342491650 + 49869442698 + 1504 + 40614 + 82591 + 344 + 41 + 741425 = 5216367650 + 691400684974\n", "CPU times: user 51.2 s, sys: 246 ms, total: 51.4 s\n", "Wall time: 51.7 s\n" ] }, { "data": { "text/plain": [ "67" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "examples = \"\"\"\n", "NUM + BER = PLAY\n", "YOU = ME ** 2\n", "SEND + MORE = MONEY\n", "PI * R**2 = AREA\n", "WRONG + WRONG = RIGHT\n", "WRIGHT + WRIGHT = TO * FLY + FLIGHT\n", "TWO + TWENTY = TWELVE + TEN\n", "A**2 + B**2 = C**2\n", "AYH**2 + BEE**2 = SEE**2\n", "RAMN = R**3 + RM**3 = N**3 + RX**3\n", "MON-EY = EVIL**(1/2)\n", "ALPHABET + LETTERS = SCRABBLE\n", "SIN**2 + COS**2 = UNITY\n", "POTATO + TOMATO = PUMPKIN\n", "ODD * ODD = FREAKY\n", "DOUBLE + DOUBLE + TOIL = TROUBLE\n", "WASH + YOUR = HANDS\n", "SPEED + LIMIT = FIFTY\n", "TERRIBLE + NUMBER = THIRTEEN\n", "SEVEN + SEVEN + SIX = TWENTY\n", "EIGHT + EIGHT + TWO + ONE + ONE = TWENTY\n", "ELEVEN + NINE + FIVE + FIVE = THIRTY\n", "NINE + SEVEN + SEVEN + SEVEN = THIRTY\n", "FOURTEEN + TEN + TEN + SEVEN = FORTYONE\n", "HAWAII + IDAHO + IOWA + OHIO = STATES\n", "VIOLIN * 2 + VIOLA = 
TRIO + SONATA\n", "SEND + A + TAD + MORE = MONEY\n", "ZEROES + ONES = BINARY\n", "DCLIZ + DLXVI = MCCXXV\n", "COUPLE + COUPLE = QUARTET\n", "FISH + N + CHIPS = SUPPER\n", "SATURN + URANUS + NEPTUNE + PLUTO = PLANETS\n", "PLUTO not in {PLANETS}\n", "EARTH + AIR + FIRE + WATER = NATURE\n", "TWO * TWO = SQUARE\n", "HIP * HIP = HURRAY\n", "NORTH / SOUTH = EAST / WEST\n", "NAUGHT ** 2 = ZERO ** 3\n", "I + THINK + IT + BE + THINE = INDEED\n", "DO + YOU + FEEL = LUCKY\n", "WELL - DO + YOU = PUNK\n", "NOW + WE + KNOW + THE = TRUTH\n", "SORRY + TO + BE + A + PARTY = POOPER\n", "SORRY + TO + BUST + YOUR = BUBBLE\n", "STEEL + BELTED = RADIALS\n", "ABRA + CADABRA + ABRA + CADABRA = HOUDINI\n", "I + GUESS + THE + TRUTH = HURTS\n", "LETS + CUT + TO + THE = CHASE\n", "THATS + THE + THEORY = ANYWAY\n", "TWO + TWO = FOUR\n", "X / X = X\n", "A**N + B**N = C**N and N > 1\n", "ATOM**0.5 = A + TO + M\n", "GLITTERS is not GOLD\n", "ONE < TWO and FOUR < FIVE\n", "ONE < TWO < THREE < TWO * TWO\n", "(2 * BE or not 2 * BE) = THE + QUES-TION\n", "sum(range(AA)) = BB\n", "sum(range(POP)) = BOBO\n", "ODD + ODD = EVEN\n", "ROMANS+ALSO+MORE+OR+LESS+ADDED = LETTERS\n", "FOUR+ONE==FIVE and ONE+ONE+ONE+ONE+ONE==FIVE\n", "TEN+SEVEN+SEVEN+SEVEN+FOUR+FOUR+ONE = FORTY\n", "NINETEEN+THIRTEEN+THREE+2*TWO+3*ONE = FORTYTWO\n", "IN + ARCTIC + TERRAIN + AN + ANCIENT + EERIE + ICE + TRACT + I + ENTER + A + TRANCE = FLATIANA\n", "ONE < TWO < THREE < SEVEN - THREE < THREE + TWO < THREE + THREE < SEVEN < SEVEN + ONE < THREE * THREE\n", "AN + ACCELERATING + INFERENTIAL + ENGINEERING + TALE + ELITE + GRANT + FEE + ET + CETERA = ARTIFICIAL + INTELLIGENCE\n", "\"\"\".strip().splitlines()\n", "\n", "def run(examples):\n", "    for e in examples:\n", "        print('{:45}| {}'.format(e, next(faster_solve(e))))\n", "    return len(examples)\n", "\n", "%time run(examples)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": 
"ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }