<div style="text-align: right"><i>Peter Norvig<br>March 2020</i></div>

# Elemental Spelling

Consider this problem: 

*Given a word, decide if it can be spelled using only the symbols in the **[periodic table](https://en.wikipedia.org/wiki/Periodic_table)** of elements. For example, the word "bananas" can be spelled with "BaNaNaS" (Barium-Sodium-Sodium-Sulfur). There may be multiple possible spellings for a word: "bananas" could also be "BaNaNAs" (Barium-Sodium-Nitrogen-Arsenic).*

To start, here is the periodic table, which I've called `elements`:

In [1]:
elements = dict(H='Hydrogen', He='Helium', Li='Lithium', Be='Beryllium', B='Boron', 
C='Carbon', N='Nitrogen', O='Oxygen', F='Fluorine', Ne='Neon', Na='Sodium', Mg='Magnesium', 
Al='Aluminium', Si='Silicon', P='Phosphorus', S='Sulfur', Cl='Chlorine', Ar='Argon', 
K='Potassium', Ca='Calcium', Sc='Scandium', Ti='Titanium', V='Vanadium', Cr='Chromium', 
Mn='Manganese', Fe='Iron', Co='Cobalt', Ni='Nickel', Cu='Copper', Zn='Zinc', Ga='Gallium', 
Ge='Germanium', As='Arsenic', Se='Selenium', Br='Bromine', Kr='Krypton', Rb='Rubidium', 
Sr='Strontium', Y='Yttrium', Zr='Zirconium', Nb='Niobium', Mo='Molybdenum', Tc='Technetium', 
Ru='Ruthenium', Rh='Rhodium', Pd='Palladium', Ag='Silver', Cd='Cadmium', In='Indium', Sn='Tin', 
Sb='Antimony', Te='Tellurium', I='Iodine', Xe='Xenon', Cs='Cesium', Ba='Barium', La='Lanthanum', 
Ce='Cerium', Pr='Praseodymium', Nd='Neodymium', Pm='Promethium', Sm='Samarium', Eu='Europium', 
Gd='Gadolinium', Tb='Terbium', Dy='Dysprosium', Ho='Holmium', Er='Erbium', Tm='Thulium', 
Yb='Ytterbium', Lu='Lutetium', Hf='Hafnium', Ta='Tantalum', W='Tungsten', Re='Rhenium', 
Os='Osmium', Ir='Iridium', Pt='Platinum', Au='Gold', Hg='Mercury', Tl='Thallium', Pb='Lead', 
Bi='Bismuth', Po='Polonium', At='Astatine', Rn='Radon', Fr='Francium', Ra='Radium', Ac='Actinium', 
Th='Thorium', Pa='Protactinium', U='Uranium', Np='Neptunium', Pu='Plutonium', Am='Americium', 
Cm='Curium', Bk='Berkelium', Cf='Californium', Es='Einsteinium', Fm='Fermium', Md='Mendelevium', 
No='Nobelium', Lr='Lawrencium', Rf='Rutherfordium', Db='Dubnium', Sg='Seaborgium', Bh='Bohrium', 
Hs='Hassium', Mt='Meitnerium', Ds='Darmstadtium', Rg='Roentgenium', Cn='Copernicium', Nh='Nihonium', 
Fl='Flerovium', Mc='Moscovium', Lv='Livermorium', Ts='Tennessine', Og='Oganesson')

In [2]:
assert len(elements) == 118
assert 'H' in elements and 'He' in elements and 'Fire' not in elements

Here is a recursive algorithm to solve the problem. A word is **spellable** if any of three cases holds: 
1) The word is the empty string.
2) The first **one** character of the word (capitalized) forms an element symbol, and the rest of the word is spellable.
3) The first **two** characters of the word (capitalized) form an element symbol, and the rest of the word is spellable.

Here is the code:

In [3]:
def spellable(word: str) -> bool:
    """Can we spell `word` by concatenating symbols in `elements`?"""
    def case(k: int) -> bool: 
        return word[:k].capitalize() in elements and spellable(word[k:])
    return word == '' or case(1) or case(2)

We can test the function on two examples:

In [4]:
spellable('bananas')

True

In [5]:
spellable('yogurt')

False
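
The recursion can reach the same suffix along more than one path (for example, after `N`,`I` and after `Ni`). As an aside, here is a minimal sketch of a memoized variant that evaluates each distinct suffix at most once. The names `SYMBOLS` and `spellable_memo` are hypothetical and not part of the notebook; `SYMBOLS` repeats the keys of `elements` only so the block runs on its own.

```python
from functools import lru_cache

# Assumption: the 118 symbols from `elements`, repeated so this block is self-contained.
SYMBOLS = frozenset('''H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr
Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te
I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg
Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs
Mt Ds Rg Cn Nh Fl Mc Lv Ts Og'''.split())

@lru_cache(maxsize=None)
def spellable_memo(word: str) -> bool:
    """Same three cases as `spellable`, with results cached per suffix."""
    if word == '':
        return True
    # Try a one-character symbol, then a two-character symbol.
    return any(word[:k].capitalize() in SYMBOLS and spellable_memo(word[k:])
               for k in (1, 2))

print(spellable_memo('bananas'), spellable_memo('yogurt'))
```

For words as short as the ones here the plain recursion is already fast, so this is a design option rather than a needed fix.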

That was easy! But maybe you'd like to see the actual spellings: `'BaNaNaS'` or `'BaNaNAs'`. The function `spellings` does that. The general idea is the same (same three cases). However, each case returns a **set** of possible spellings. It is important to distinguish between the spellings of an unspellable word (the empty set) and the spellings of the empty string (a set consisting of one spelling, the empty string).

In [6]:
def spellings(word: str) -> set:
    """All spellings of `word` formed by concatenating symbols in `elements`."""
    def case(k: int) -> set:
        head, tail = word[:k].capitalize(), word[k:]
        if head in elements:
            return {head + rest for rest in spellings(tail)}
        else:
            return set()
    return {''} if word == '' else case(1) | case(2)

The two examples:

In [7]:
spellings('bananas')

{'BaNaNAs', 'BaNaNaS'}

In [8]:
spellings('yogurt') 

set()
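
To see why the base case has to be `{''}` rather than the empty set, here is a small sketch that makes the base case a parameter. It assumes a hypothetical five-symbol stand-in (`DEMO_SYMBOLS`) instead of the full table, and the function name `demo_spellings` is illustrative only.

```python
# Hypothetical stand-in: just enough symbols to spell "bananas", not the full table.
DEMO_SYMBOLS = {'Ba', 'Na', 'N', 'As', 'S'}

def demo_spellings(word: str, empty_case: frozenset = frozenset({''})) -> set:
    """Same three cases as `spellings`, with the empty-string result configurable."""
    if word == '':
        return set(empty_case)
    result = set()
    for k in (1, 2):
        head, tail = word[:k].capitalize(), word[k:]
        if head in DEMO_SYMBOLS:
            # Each spelling of the tail is extended with the matched symbol.
            result |= {head + rest for rest in demo_spellings(tail, empty_case)}
    return result

print(demo_spellings('bananas'))               # correct base case: {''}
print(demo_spellings('bananas', frozenset()))  # wrong base case: the empty set
```

With the empty set as the base case, the set comprehension at the end of the word has nothing to extend, so every word comes out with no spellings at all.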

# Testing

Here I define `bad`, a list of words that are **not** spellable, and `good`, a list of words that **are**. Then I make some assertions:

In [9]:
bad  = 'hello world failure not an alternative'.split() # Unspellable words

good = '''howdy orb nonsuccess is notan option 
bananas wonky nutso psychic attention functions officious hyperbolic 
vichyssois bobbysocks  phony whippersnappers soupspoons buffoonish 
bilateralism capabilities alterabilities cioppino pincushion 
onionskins unprofessional biostatistical copernicus inconspicuous 
nonpoisonous floccinaucinihilipilification'''.split() # Spellable words

for w in bad:
    assert not spellable(w) and not spellings(w)
for w in good:
    assert spellable(w) and spellings(w)

And here are the actual spellings for the good words:

In [10]:
[spellings(w) for w in good]

[{'HOWDy', 'HoWDy'},
 {'ORb'},
 {'NONSUCCEsS', 'NONSUCCeSS', 'NoNSUCCEsS', 'NoNSUCCeSS'},
 {'IS'},
 {'NOTaN', 'NoTaN'},
 {'OPTiON', 'OPtION'},
 {'BaNaNAs', 'BaNaNaS'},
 {'WONKY'},
 {'NUTsO'},
 {'PSYCHIC'},
 {'AtTeNTiON'},
 {'FUNCTiONS'},
 {'OFFICIOUS'},
 {'HYPErBOLiC'},
 {'VICHYSSOIS'},
 {'BOBBYSOCKS'},
 {'PHONY', 'PHoNY'},
 {'WHIPPErSNaPPErS'},
 {'SOUPSPOONS', 'SOUPSPoONS'},
 {'BUFFOONISH', 'BUFFOONiSH'},
 {'BILaTeRaLiSm', 'BiLaTeRaLiSm'},
 {'CaPaBILiTiEs', 'CaPaBiLiTiEs'},
 {'AlTeRaBILiTiEs', 'AlTeRaBiLiTiEs'},
 {'CIOPPINO', 'CIOPPINo', 'CIOPPInO'},
 {'PINCUSHION', 'PINCuSHION', 'PInCUSHION', 'PInCuSHION'},
 {'ONIONSKINS', 'ONIONSKInS', 'ONiONSKINS', 'ONiONSKInS'},
 {'UNPrOFEsSIONAl', 'UNPrOFEsSiONAl', 'UNPrOFeSSIONAl', 'UNPrOFeSSiONAl'},
 {'BIOSTaTiSTiCAl', 'BIOsTaTiSTiCAl', 'BiOSTaTiSTiCAl', 'BiOsTaTiSTiCAl'},
 {'COPErNICUS',
  'COPErNICuS',
  'COPErNiCUS',
  'COPErNiCuS',
  'CoPErNICUS',
  'CoPErNICuS',
  'CoPErNiCUS',
  'CoPErNiCuS'},
 {'INCONSPICUOUS',
  'INCONSPICuOUS',
  'INCo

What about spelling the actual names of the elements using the element symbols? We see below that only 15 out of 118 are spellable. 

`%time` tells us this took only about a millisecond, for 133 top-level calls to `spellings`: one per name in the `if` filter, plus one more for each of the 15 names that pass.

In [11]:
%time [spellings(w) for w in elements.values() if spellings(w)]

CPU times: user 1.05 ms, sys: 7 µs, total: 1.05 ms
Wall time: 1.07 ms


[{'CArBON', 'CaRbON'},
 {'NeON'},
 {'SILiCON', 'SILiCoN', 'SiLiCON', 'SiLiCoN'},
 {'PHOSPHORuS',
  'PHOSPHoRuS',
  'PHOsPHORuS',
  'PHOsPHoRuS',
  'PHoSPHORuS',
  'PHoSPHoRuS'},
 {'IrON'},
 {'COPPEr', 'CoPPEr'},
 {'ArSeNIC', 'ArSeNiC'},
 {'KrYPtON'},
 {'SILvEr', 'SiLvEr'},
 {'TiN'},
 {'XeNON', 'XeNoN'},
 {'BISmUTh', 'BiSmUTh'},
 {'AsTaTiNe'},
 {'TeNNEsSINe', 'TeNNEsSiNe', 'TeNNeSSINe', 'TeNNeSSiNe'},
 {'OGaNEsSON', 'OGaNeSSON'}]
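
The sizes of the sets above can be cross-checked without building the spellings themselves. The following is a sketch of a different technique (a bottom-up count over suffixes), not the notebook's method; `SYMBOLS` and `count_spellings` are hypothetical names, and `SYMBOLS` repeats the keys of `elements` so the block runs on its own.

```python
# Assumption: the 118 symbols from `elements`, repeated so this block is self-contained.
SYMBOLS = frozenset('''H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr
Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te
I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg
Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs
Mt Ds Rg Cn Nh Fl Mc Lv Ts Og'''.split())

def count_spellings(word: str) -> int:
    """Number of ways to split `word` into element symbols."""
    n = len(word)
    counts = [0] * (n + 1)   # counts[i] = number of spellings of word[i:]
    counts[n] = 1            # the empty suffix has exactly one spelling: ''
    for i in range(n - 1, -1, -1):   # fill in from the end of the word backwards
        for k in (1, 2):
            if i + k <= n and word[i:i + k].capitalize() in SYMBOLS:
                counts[i] += counts[i + k]
    return counts[0]

for w in ['bananas', 'yogurt', 'copernicus', 'phosphorus']:
    print(w, count_spellings(w))
```

For purely alphabetic words, each split produces a distinct capitalization, so the count should agree with the size of the corresponding set returned by `spellings`.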