{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "19ee7dde-0d74-47e8-8d0d-4ffcb99e2f5a",
   "metadata": {},
   "source": [
    "<div align=\"right\"><i>Peter Norvig<br>Sept 25, 2024</i></div>\n",
    "\n",
    "# The Languages of English, Math, and Programming\n",
    "\n",
    "My colleague [Wei-Hwa Huang](https://en.wikipedia.org/wiki/Wei-Hwa_Huang) gave several AI chatbots this prompt: \n",
    "\n",
    "**List all the ways in which three distinct positive integers have a product of 108.**\n",
    "\n",
    "I tested this prompt on the following solvers:\n",
    "- [A human programmer](https://github.com/norvig/)\n",
    "- [Gemini Advanced](https://gemini.google.com/app)\n",
    "- [ChatGPT 4o](https://chatgpt.com/)\n",
    "- [Microsoft Copilot](https://copilot.microsoft.com/)\n",
    "- [Anthropic Claude 3.5 Sonnet](https://claude.ai/new)\n",
    "- [Meta AI Llama 3](https://www.meta.ai/)\n",
    "- [Perplexity](https://www.perplexity.ai/)\n",
    "- [Cohere Chat](https://cohere.com/chat)\n",
    "- [HuggingFace Chat](https://huggingface.co/chat/)\n",
    "- [You.com](https://you.com/)\n",
    "\n",
    "All the LLMs Wei-Hwa originally tried got this one wrong. From my expanded list, Gemini, ChatGPT 4o, You.com and the human got it right, and 5 other models made mistakes:\n",
    "- The LLMs all started their answer by noting that 108 = 2 × 2 × 3 × 3 × 3, and then tried to partition those factors into three distinct subsets and report all ways to do so.\n",
    "- So far so good.\n",
    "- But most of them forgot that 1 could be a factor of 108 (or equivalently, that the empty set of factors is a valid subset). \n",
    "- Some of the models ignored the need for \"distinct\" integers, and proposed, say, 3 × 6 × 6.\n",
    "- Some got 5 or 6 correct triplets, and then stopped, perhaps because their attention mechanism didn't go back far enough.\n",
    "- SOme even proposed non-integers as \"factors\".\n",
    "\n",
    "I thought that the models might have skipped 1 as a factor because 1 is not listed in the prime factorization, so it is easy to forget. But in programming, it is more natural to run a loop from 1 to *n* than from 2 to *n*, so this error would be less likely. Therefore, I decided to test all the models  with the following prompt: \n",
    "\n",
    "**Write a Python program to list all the ways in which three distinct positive integers have a product of 108.**\n",
    "\n",
    "# TLDR: Conclusion\n",
    "\n",
    "The models did much better with this prompt. My conclusion is that the language used to solve a problem matters. Sometimes a natural language such as English is a good choice, sometimes you need the language of mathematical equations, or maybe chemical equations, and sometimes a programming language is best.\n",
    "\n",
    "# Human\n",
    "\n",
    "A human (me) was able to correctly respond to the prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "f8a27ed0-c2b1-47a0-bdf0-c6a8cd789dc5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{1, 2, 54},\n",
       " {1, 3, 36},\n",
       " {1, 4, 27},\n",
       " {1, 6, 18},\n",
       " {1, 9, 12},\n",
       " {2, 3, 18},\n",
       " {2, 6, 9},\n",
       " {3, 4, 9}]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from math      import prod\n",
    "from itertools import combinations\n",
    "from typing    import *\n",
    "\n",
    "def find_products(k=3, n=108) -> List[Set[int]]:\n",
    "    \"\"\"A list of all ways in which `k` distinct positive integers have a product of `n`.\"\"\" \n",
    "    factors = {i for i in range(1, n + 1) if n % i == 0}\n",
    "    return [set(ints) for ints in combinations(factors, k) if prod(ints) == n]\n",
    "\n",
    "find_products()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7682af0-8c46-4e19-bdba-71f28bbfa101",
   "metadata": {},
   "source": [
    "The program can solve different versions of the problem (although it would be slow when `n` is over a billion or so):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "55dfa9c2-f366-42c8-ae50-6a1df80c47b3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{1, 2, 3, 4, 15},\n",
       " {1, 2, 3, 5, 12},\n",
       " {1, 2, 3, 6, 10},\n",
       " {1, 2, 4, 5, 9},\n",
       " {1, 3, 4, 5, 6}]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "find_products(5, 360)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42fc5098-2315-4984-b256-4fbf2f6fec51",
   "metadata": {},
   "source": [
    "# Gemini Advanced\n",
    "\n",
    "Gemini produced three drafts, of which the following one was correct. In another draft, it had the line `k = product // (i * j)`, using integer division, which is incompatible with the `k.is_integer()` test. Here is the correct draft:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "29789909-15d1-4833-b35f-cb5399893ba2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
     ]
    }
   ],
   "source": [
    "def find_combinations(product):\n",
    "  \"\"\"Finds all combinations of three distinct positive integers that multiply to the given product.\n",
    "\n",
    "  Args:\n",
    "    product: The product to find combinations for.\n",
    "\n",
    "  Returns:\n",
    "    A list of tuples, where each tuple represents a combination of three integers.\n",
    "  \"\"\"\n",
    "\n",
    "  combinations = []\n",
    "  for i in range(1, product):\n",
    "    for j in range(i + 1, product):\n",
    "      k = product / (i * j)\n",
    "      if k > j and k.is_integer():\n",
    "        combinations.append((i, j, int(k)))\n",
    "  return combinations\n",
    "\n",
    "product = 108\n",
    "result = find_combinations(product)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81f032a3-5621-470e-9ada-8d3e026ff1cc",
   "metadata": {},
   "source": [
    "# ChatGPT 4o\n",
    "\n",
    "ChatGPT got it right (although it imported `itertools` and then never used it)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c248a136-5ed5-40dd-ad48-b303e99c3675",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "All distinct triplets of integers that multiply to 108:\n",
      "(1, 2, 54)\n",
      "(1, 3, 36)\n",
      "(1, 4, 27)\n",
      "(1, 6, 18)\n",
      "(1, 9, 12)\n",
      "(2, 3, 18)\n",
      "(2, 6, 9)\n",
      "(3, 4, 9)\n"
     ]
    }
   ],
   "source": [
    "import itertools\n",
    "\n",
    "def find_triplets_with_product(product):\n",
    "    solutions = []\n",
    "    \n",
    "    # Iterate over all possible combinations of distinct integers\n",
    "    for a in range(1, int(product ** (1/3)) + 1):\n",
    "        for b in range(a + 1, product // a + 1):\n",
    "            for c in range(b + 1, product // (a * b) + 1):\n",
    "                if a * b * c == product:\n",
    "                    solutions.append((a, b, c))\n",
    "    \n",
    "    return solutions\n",
    "\n",
    "product = 108\n",
    "triplets = find_triplets_with_product(product)\n",
    "\n",
    "if triplets:\n",
    "    print(f\"All distinct triplets of integers that multiply to {product}:\")\n",
    "    for triplet in triplets:\n",
    "        print(triplet)\n",
    "else:\n",
    "    print(f\"No distinct triplets found that multiply to {product}.\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17e74293-feab-4fff-b682-bc26823ebefa",
   "metadata": {},
   "source": [
    "# Bing CoPilot\n",
    "\n",
    "Bing produces a very clean (but somewhat slower) `find_triplets` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2b0ee5b6-64bd-4581-bdc7-d22065f02d64",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "All distinct triplets with a product of 108:\n",
      "(1, 2, 54)\n",
      "(1, 3, 36)\n",
      "(1, 4, 27)\n",
      "(1, 6, 18)\n",
      "(1, 9, 12)\n",
      "(2, 3, 18)\n",
      "(2, 6, 9)\n",
      "(3, 4, 9)\n"
     ]
    }
   ],
   "source": [
    "def find_triplets(product):\n",
    "    triplets = []\n",
    "    for a in range(1, product + 1):\n",
    "        for b in range(a + 1, product + 1):\n",
    "            for c in range(b + 1, product + 1):\n",
    "                if a * b * c == product:\n",
    "                    triplets.append((a, b, c))\n",
    "    return triplets\n",
    "\n",
    "product = 108\n",
    "triplets = find_triplets(product)\n",
    "\n",
    "print(f\"All distinct triplets with a product of {product}:\")\n",
    "for triplet in triplets:\n",
    "    print(triplet)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fddeabc0-4925-4145-b666-7a094fd61980",
   "metadata": {},
   "source": [
    "# Claude 3.5 Sonnet\n",
    "\n",
    "Claude's answer is quite simlar to ChatGPT 4o's (but I would criticize it for not taking a parameter):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "83f9088c-8444-4207-ad24-91d8e3a98004",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "There are 8 ways to express 108 as a product of three distinct positive integers:\n",
      "1 x 2 x 54 = 108\n",
      "1 x 3 x 36 = 108\n",
      "1 x 4 x 27 = 108\n",
      "1 x 6 x 18 = 108\n",
      "1 x 9 x 12 = 108\n",
      "2 x 3 x 18 = 108\n",
      "2 x 6 x 9 = 108\n",
      "3 x 4 x 9 = 108\n"
     ]
    }
   ],
   "source": [
    "def find_combinations():\n",
    "    result = []\n",
    "    for i in range(1, 108):\n",
    "        for j in range(i + 1, 108):\n",
    "            k = 108 // (i * j)\n",
    "            if i * j * k == 108 and k > j:\n",
    "                result.append((i, j, k))\n",
    "    return result\n",
    "\n",
    "def main():\n",
    "    combinations = find_combinations()\n",
    "    print(f\"There are {len(combinations)} ways to express 108 as a product of three distinct positive integers:\")\n",
    "    for combo in combinations:\n",
    "        print(f\"{combo[0]} x {combo[1]} x {combo[2]} = 108\")\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eefc3007-105d-41cd-9023-0a34609517fd",
   "metadata": {},
   "source": [
    "# Meta AI Llama 3.1\n",
    "\n",
    "This program forgets to test for `i < j < k`, and thus produces six permutations of each triplet. Oddly, it claims that \"*When you run this program, it will output: [(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]*\", so it knows the correct answer, but it didn't run the program to test that it is correct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "de1d0cc3-10a8-417a-a35a-0a66325f549c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (1, 12, 9), (1, 18, 6), (1, 27, 4), (1, 36, 3), (1, 54, 2), (2, 1, 54), (2, 3, 18), (2, 6, 9), (2, 9, 6), (2, 18, 3), (2, 54, 1), (3, 1, 36), (3, 2, 18), (3, 4, 9), (3, 9, 4), (3, 18, 2), (3, 36, 1), (4, 1, 27), (4, 3, 9), (4, 9, 3), (4, 27, 1), (6, 1, 18), (6, 2, 9), (6, 9, 2), (6, 18, 1), (9, 1, 12), (9, 2, 6), (9, 3, 4), (9, 4, 3), (9, 6, 2), (9, 12, 1), (12, 1, 9), (12, 9, 1), (18, 1, 6), (18, 2, 3), (18, 3, 2), (18, 6, 1), (27, 1, 4), (27, 4, 1), (36, 1, 3), (36, 3, 1), (54, 1, 2), (54, 2, 1)]\n"
     ]
    }
   ],
   "source": [
    "def find_factors(n):\n",
    "    factors = []\n",
    "    for i in range(1, n + 1):\n",
    "        if n % i == 0:\n",
    "            factors.append(i)\n",
    "    return factors\n",
    "\n",
    "def find_triplets(n):\n",
    "    factors = find_factors(n)\n",
    "    triplets = []\n",
    "    for i in factors:\n",
    "        for j in factors:\n",
    "            if i * j <= n:\n",
    "                k = n // (i * j)\n",
    "                if i != j and j != k and i != k and i * j * k == n:\n",
    "                    triplets.append((i, j, k))\n",
    "    return triplets\n",
    "\n",
    "print(find_triplets(108))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "742f3b13-a53f-497a-97e7-5aa58b361951",
   "metadata": {},
   "source": [
    "# Perplexity\n",
    "\n",
    "Perplexity does well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "7a764a01-8f18-4773-bd7a-fbc8dc652450",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "All combinations of three distinct positive integers with a product of 108:\n",
      "1 x 2 x 54 = 108\n",
      "1 x 3 x 36 = 108\n",
      "1 x 4 x 27 = 108\n",
      "1 x 6 x 18 = 108\n",
      "1 x 9 x 12 = 108\n",
      "2 x 3 x 18 = 108\n",
      "2 x 6 x 9 = 108\n",
      "3 x 4 x 9 = 108\n",
      "\n",
      "Total number of combinations: 8\n"
     ]
    }
   ],
   "source": [
    "def find_combinations(target):\n",
    "    combinations = []\n",
    "    \n",
    "    for i in range(1, target + 1):\n",
    "        if target % i == 0:\n",
    "            for j in range(i + 1, target + 1):\n",
    "                if (target // i) % j == 0:\n",
    "                    k = target // (i * j)\n",
    "                    if k > j and i * j * k == target:\n",
    "                        combinations.append((i, j, k))\n",
    "    \n",
    "    return combinations\n",
    "\n",
    "target_product = 108\n",
    "result = find_combinations(target_product)\n",
    "\n",
    "print(f\"All combinations of three distinct positive integers with a product of {target_product}:\")\n",
    "for combo in result:\n",
    "    print(f\"{combo[0]} x {combo[1]} x {combo[2]} = {target_product}\")\n",
    "\n",
    "print(f\"\\nTotal number of combinations: {len(result)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53f85624-bfe3-42cc-b55b-b86a56089314",
   "metadata": {},
   "source": [
    "# Cohere Chat Playground\n",
    "\n",
    "Cohere fails to find any combinations. The 5th line should just be `k = product // i // j`; Cohere bizarrely adds `* j * i`. It also fails to test that `i < j < k`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "61139601-ced2-4598-ae76-464c1041d2d1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "All combinations of three distinct positive integers with a product of 108:\n"
     ]
    }
   ],
   "source": [
    "def find_combinations(product):\n",
    "    combinations = []\n",
    "    for i in range(1, product // 3 + 1):\n",
    "        for j in range(i, product // 2 + 1):\n",
    "            k = product // i // j * j * i\n",
    "            if i * j * k == product and i != j and j != k and i != k:\n",
    "                combinations.append((i, j, k))\n",
    "    return combinations\n",
    "\n",
    "product = 108\n",
    "combinations = find_combinations(product)\n",
    "print(f\"All combinations of three distinct positive integers with a product of {product}:\")\n",
    "for combo in combinations:\n",
    "    print(combo)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51b0b353-79d2-4da7-a6da-8231d5ff811e",
   "metadata": {},
   "source": [
    "# HuggingChat\n",
    "\n",
    "Hugging Chat produced a correct concise program. I note that `i < j < k` would be cleaner than `k >= j and i!= j and j!= k` here (and for others as well)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "6e4b823b-66f1-48ae-b4ca-00387d480e03",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
     ]
    }
   ],
   "source": [
    "def find_triplets(n):\n",
    "    triplets = []\n",
    "    for i in range(1, n):\n",
    "        for j in range(i+1, n):\n",
    "            k = n // (i * j)\n",
    "            if k >= j and i * j * k == n and i!= j and j!= k:\n",
    "                triplets.append((i, j, k))\n",
    "    return triplets\n",
    "\n",
    "print(find_triplets(108))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23d78bca-9b35-43c0-891a-a9c0e6801b31",
   "metadata": {},
   "source": [
    "# You.com\n",
    "\n",
    "You.com produces a correct solution, with some nice optimizations that make it *O*(*n*<sup>5/6</sup>), whereas most of the solutions are *O*(*n*<sup>2</sup>). This means it can handle a 14-digit product in a second of run time, whereas the human-written solution can only handle 10-digit products in one second, while the HuggingChat version (for example) takes several seconds just to handle a 5-digit product."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "fe69bb04-02de-4f38-a8ab-6693c15b02c4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
     ]
    }
   ],
   "source": [
    "def find_triplets(product):\n",
    "    triplets = []\n",
    "    for i in range(1, int(product ** (1/3)) + 1):  # The cube root of the product is the maximum possible value for i\n",
    "        if product % i == 0:\n",
    "            for j in range(i + 1, int((product / i) ** 0.5) + 1):  # The square root of the product divided by i is the maximum possible value for j\n",
    "                if (product / i) % j == 0:\n",
    "                    k = product // (i * j)\n",
    "                    if k > j:  # Ensure the integers are distinct\n",
    "                        triplets.append((i, j, k))\n",
    "    return triplets\n",
    "\n",
    "triplets = find_triplets(108)\n",
    "print(triplets)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc16ca96-0b05-4ad4-82c0-552bb99373fd",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}