\n",
"\n",
"# The Languages of English, Math, and Programming\n",
"\n",
"My colleague [Wei-Hwa Huang](https://en.wikipedia.org/wiki/Wei-Hwa_Huang) gave several AI chatbots this prompt: \n",
"\n",
"**List all the ways in which three distinct positive integers have a product of 108.**\n",
"\n",
"I tested this prompt on the following solvers:\n",
"- [A human programmer](https://github.com/norvig/)\n",
"- [Gemini Advanced](https://gemini.google.com/app)\n",
"- [ChatGPT 4o](https://chatgpt.com/)\n",
"- [Microsoft Copilot](https://copilot.microsoft.com/)\n",
"- [Anthropic Claude 3.5 Sonnet](https://claude.ai/new)\n",
"- [Meta AI Llama 3](https://www.meta.ai/)\n",
"- [Perplexity](https://www.perplexity.ai/)\n",
"- [Cohere Chat](https://cohere.com/chat)\n",
"- [HuggingFace Chat](https://huggingface.co/chat/)\n",
"- [You.com](https://you.com/)\n",
"\n",
"All the LLMs Wei-Hwa originally tried got this one wrong. From my expanded list, Gemini, ChatGPT 4o, You.com and the human got it right, and 5 other models made mistakes:\n",
"- The LLMs all started their answer by noting that 108 = 2 × 2 × 3 × 3 × 3, and then tried to partition those factors into three distinct subsets and report all ways to do so.\n",
"- So far so good.\n",
"- But most of them forgot that 1 could be a factor of 108 (or equivalently, that the empty set of factors is a valid subset). \n",
"- Some of the models ignored the need for \"distinct\" integers, and proposed, say, 3 × 6 × 6.\n",
"- Some got 5 or 6 correct triplets, and then stopped, perhaps because their attention mechanism didn't go back far enough.\n",
"- SOme even proposed non-integers as \"factors\".\n",
"\n",
"I thought that the models might have skipped 1 as a factor because 1 is not listed in the prime factorization, so it is easy to forget. But in programming, it is more natural to run a loop from 1 to *n* than from 2 to *n*, so this error would be less likely. Therefore, I decided to test all the models with the following prompt: \n",
"\n",
"**Write a Python program to list all the ways in which three distinct positive integers have a product of 108.**\n",
"\n",
"# TLDR: Conclusion\n",
"\n",
"The models did much better with this prompt. My conclusion is that the language used to solve a problem matters. Sometimes a natural language such as English is a good choice, sometimes you need the language of mathematical equations, or maybe chemical equations, and sometimes a programming language is best.\n",
"\n",
"# Human\n",
"\n",
"A human (me) was able to correctly respond to the prompt:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f8a27ed0-c2b1-47a0-bdf0-c6a8cd789dc5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{1, 2, 54},\n",
" {1, 3, 36},\n",
" {1, 4, 27},\n",
" {1, 6, 18},\n",
" {1, 9, 12},\n",
" {2, 3, 18},\n",
" {2, 6, 9},\n",
" {3, 4, 9}]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from math import prod\n",
"from itertools import combinations\n",
"from typing import *\n",
"\n",
"def find_products(k=3, n=108) -> List[Set[int]]:\n",
" \"\"\"A list of all ways in which `k` distinct positive integers have a product of `n`.\"\"\" \n",
" factors = {i for i in range(1, n + 1) if n % i == 0}\n",
" return [set(ints) for ints in combinations(factors, k) if prod(ints) == n]\n",
"\n",
"find_products()"
]
},
{
"cell_type": "markdown",
"id": "b7682af0-8c46-4e19-bdba-71f28bbfa101",
"metadata": {},
"source": [
"The program can solve different versions of the problem (although it would be slow when `n` is over a billion or so):"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "55dfa9c2-f366-42c8-ae50-6a1df80c47b3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{1, 2, 3, 4, 15},\n",
" {1, 2, 3, 5, 12},\n",
" {1, 2, 3, 6, 10},\n",
" {1, 2, 4, 5, 9},\n",
" {1, 3, 4, 5, 6}]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"find_products(5, 360)"
]
},
{
"cell_type": "markdown",
"id": "42fc5098-2315-4984-b256-4fbf2f6fec51",
"metadata": {},
"source": [
"# Gemini Advanced\n",
"\n",
"Gemini produced three drafts, of which the following one was correct. In another draft, it had the line `k = product // (i * j)`, using integer division, which is incompatible with the `k.is_integer()` test. Here is the correct draft:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "29789909-15d1-4833-b35f-cb5399893ba2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
]
}
],
"source": [
"def find_combinations(product):\n",
" \"\"\"Finds all combinations of three distinct positive integers that multiply to the given product.\n",
"\n",
" Args:\n",
" product: The product to find combinations for.\n",
"\n",
" Returns:\n",
" A list of tuples, where each tuple represents a combination of three integers.\n",
" \"\"\"\n",
"\n",
" combinations = []\n",
" for i in range(1, product):\n",
" for j in range(i + 1, product):\n",
" k = product / (i * j)\n",
" if k > j and k.is_integer():\n",
" combinations.append((i, j, int(k)))\n",
" return combinations\n",
"\n",
"product = 108\n",
"result = find_combinations(product)\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"id": "81f032a3-5621-470e-9ada-8d3e026ff1cc",
"metadata": {},
"source": [
"# ChatGPT 4o\n",
"\n",
"ChatGPT got it right (although it imported `itertools` and then never used it)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c248a136-5ed5-40dd-ad48-b303e99c3675",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All distinct triplets of integers that multiply to 108:\n",
"(1, 2, 54)\n",
"(1, 3, 36)\n",
"(1, 4, 27)\n",
"(1, 6, 18)\n",
"(1, 9, 12)\n",
"(2, 3, 18)\n",
"(2, 6, 9)\n",
"(3, 4, 9)\n"
]
}
],
"source": [
"import itertools\n",
"\n",
"def find_triplets_with_product(product):\n",
" solutions = []\n",
" \n",
" # Iterate over all possible combinations of distinct integers\n",
" for a in range(1, int(product ** (1/3)) + 1):\n",
" for b in range(a + 1, product // a + 1):\n",
" for c in range(b + 1, product // (a * b) + 1):\n",
" if a * b * c == product:\n",
" solutions.append((a, b, c))\n",
" \n",
" return solutions\n",
"\n",
"product = 108\n",
"triplets = find_triplets_with_product(product)\n",
"\n",
"if triplets:\n",
" print(f\"All distinct triplets of integers that multiply to {product}:\")\n",
" for triplet in triplets:\n",
" print(triplet)\n",
"else:\n",
" print(f\"No distinct triplets found that multiply to {product}.\")\n"
]
},
{
"cell_type": "markdown",
"id": "17e74293-feab-4fff-b682-bc26823ebefa",
"metadata": {},
"source": [
"# Bing CoPilot\n",
"\n",
"Bing produces a very clean (but somewhat slower) `find_triplets` function."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "2b0ee5b6-64bd-4581-bdc7-d22065f02d64",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All distinct triplets with a product of 108:\n",
"(1, 2, 54)\n",
"(1, 3, 36)\n",
"(1, 4, 27)\n",
"(1, 6, 18)\n",
"(1, 9, 12)\n",
"(2, 3, 18)\n",
"(2, 6, 9)\n",
"(3, 4, 9)\n"
]
}
],
"source": [
"def find_triplets(product):\n",
" triplets = []\n",
" for a in range(1, product + 1):\n",
" for b in range(a + 1, product + 1):\n",
" for c in range(b + 1, product + 1):\n",
" if a * b * c == product:\n",
" triplets.append((a, b, c))\n",
" return triplets\n",
"\n",
"product = 108\n",
"triplets = find_triplets(product)\n",
"\n",
"print(f\"All distinct triplets with a product of {product}:\")\n",
"for triplet in triplets:\n",
" print(triplet)"
]
},
{
"cell_type": "markdown",
"id": "fddeabc0-4925-4145-b666-7a094fd61980",
"metadata": {},
"source": [
"# Claude 3.5 Sonnet\n",
"\n",
"Claude's answer is quite simlar to ChatGPT 4o's (but I would criticize it for not taking a parameter):"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "83f9088c-8444-4207-ad24-91d8e3a98004",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 8 ways to express 108 as a product of three distinct positive integers:\n",
"1 x 2 x 54 = 108\n",
"1 x 3 x 36 = 108\n",
"1 x 4 x 27 = 108\n",
"1 x 6 x 18 = 108\n",
"1 x 9 x 12 = 108\n",
"2 x 3 x 18 = 108\n",
"2 x 6 x 9 = 108\n",
"3 x 4 x 9 = 108\n"
]
}
],
"source": [
"def find_combinations():\n",
" result = []\n",
" for i in range(1, 108):\n",
" for j in range(i + 1, 108):\n",
" k = 108 // (i * j)\n",
" if i * j * k == 108 and k > j:\n",
" result.append((i, j, k))\n",
" return result\n",
"\n",
"def main():\n",
" combinations = find_combinations()\n",
" print(f\"There are {len(combinations)} ways to express 108 as a product of three distinct positive integers:\")\n",
" for combo in combinations:\n",
" print(f\"{combo[0]} x {combo[1]} x {combo[2]} = 108\")\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
},
{
"cell_type": "markdown",
"id": "eefc3007-105d-41cd-9023-0a34609517fd",
"metadata": {},
"source": [
"# Meta AI Llama 3.1\n",
"\n",
"This program forgets to test for `i < j < k`, and thus produces six permutations of each triplet. Oddly, it claims that \"*When you run this program, it will output: [(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]*\", so it knows the correct answer, but it didn't run the program to test that it is correct."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "de1d0cc3-10a8-417a-a35a-0a66325f549c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (1, 12, 9), (1, 18, 6), (1, 27, 4), (1, 36, 3), (1, 54, 2), (2, 1, 54), (2, 3, 18), (2, 6, 9), (2, 9, 6), (2, 18, 3), (2, 54, 1), (3, 1, 36), (3, 2, 18), (3, 4, 9), (3, 9, 4), (3, 18, 2), (3, 36, 1), (4, 1, 27), (4, 3, 9), (4, 9, 3), (4, 27, 1), (6, 1, 18), (6, 2, 9), (6, 9, 2), (6, 18, 1), (9, 1, 12), (9, 2, 6), (9, 3, 4), (9, 4, 3), (9, 6, 2), (9, 12, 1), (12, 1, 9), (12, 9, 1), (18, 1, 6), (18, 2, 3), (18, 3, 2), (18, 6, 1), (27, 1, 4), (27, 4, 1), (36, 1, 3), (36, 3, 1), (54, 1, 2), (54, 2, 1)]\n"
]
}
],
"source": [
"def find_factors(n):\n",
" factors = []\n",
" for i in range(1, n + 1):\n",
" if n % i == 0:\n",
" factors.append(i)\n",
" return factors\n",
"\n",
"def find_triplets(n):\n",
" factors = find_factors(n)\n",
" triplets = []\n",
" for i in factors:\n",
" for j in factors:\n",
" if i * j <= n:\n",
" k = n // (i * j)\n",
" if i != j and j != k and i != k and i * j * k == n:\n",
" triplets.append((i, j, k))\n",
" return triplets\n",
"\n",
"print(find_triplets(108))"
]
},
{
"cell_type": "markdown",
"id": "742f3b13-a53f-497a-97e7-5aa58b361951",
"metadata": {},
"source": [
"# Perplexity\n",
"\n",
"Perplexity does well."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7a764a01-8f18-4773-bd7a-fbc8dc652450",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All combinations of three distinct positive integers with a product of 108:\n",
"1 x 2 x 54 = 108\n",
"1 x 3 x 36 = 108\n",
"1 x 4 x 27 = 108\n",
"1 x 6 x 18 = 108\n",
"1 x 9 x 12 = 108\n",
"2 x 3 x 18 = 108\n",
"2 x 6 x 9 = 108\n",
"3 x 4 x 9 = 108\n",
"\n",
"Total number of combinations: 8\n"
]
}
],
"source": [
"def find_combinations(target):\n",
" combinations = []\n",
" \n",
" for i in range(1, target + 1):\n",
" if target % i == 0:\n",
" for j in range(i + 1, target + 1):\n",
" if (target // i) % j == 0:\n",
" k = target // (i * j)\n",
" if k > j and i * j * k == target:\n",
" combinations.append((i, j, k))\n",
" \n",
" return combinations\n",
"\n",
"target_product = 108\n",
"result = find_combinations(target_product)\n",
"\n",
"print(f\"All combinations of three distinct positive integers with a product of {target_product}:\")\n",
"for combo in result:\n",
" print(f\"{combo[0]} x {combo[1]} x {combo[2]} = {target_product}\")\n",
"\n",
"print(f\"\\nTotal number of combinations: {len(result)}\")"
]
},
{
"cell_type": "markdown",
"id": "53f85624-bfe3-42cc-b55b-b86a56089314",
"metadata": {},
"source": [
"# Cohere Chat Playground\n",
"\n",
"Cohere fails to find any combinations. The 5th line should just be `k = product // i // j`; Cohere bizarrely adds `* j * i`. It also fails to test that `i < j < k`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "61139601-ced2-4598-ae76-464c1041d2d1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All combinations of three distinct positive integers with a product of 108:\n"
]
}
],
"source": [
"def find_combinations(product):\n",
" combinations = []\n",
" for i in range(1, product // 3 + 1):\n",
" for j in range(i, product // 2 + 1):\n",
" k = product // i // j * j * i\n",
" if i * j * k == product and i != j and j != k and i != k:\n",
" combinations.append((i, j, k))\n",
" return combinations\n",
"\n",
"product = 108\n",
"combinations = find_combinations(product)\n",
"print(f\"All combinations of three distinct positive integers with a product of {product}:\")\n",
"for combo in combinations:\n",
" print(combo)"
]
},
{
"cell_type": "markdown",
"id": "51b0b353-79d2-4da7-a6da-8231d5ff811e",
"metadata": {},
"source": [
"# HuggingChat\n",
"\n",
"Hugging Chat produced a correct concise program. I note that `i < j < k` would be cleaner than `k >= j and i!= j and j!= k` here (and for others as well)."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "6e4b823b-66f1-48ae-b4ca-00387d480e03",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
]
}
],
"source": [
"def find_triplets(n):\n",
" triplets = []\n",
" for i in range(1, n):\n",
" for j in range(i+1, n):\n",
" k = n // (i * j)\n",
" if k >= j and i * j * k == n and i!= j and j!= k:\n",
" triplets.append((i, j, k))\n",
" return triplets\n",
"\n",
"print(find_triplets(108))"
]
},
{
"cell_type": "markdown",
"id": "23d78bca-9b35-43c0-891a-a9c0e6801b31",
"metadata": {},
"source": [
"# You.com\n",
"\n",
"You.com produces a correct solution, with some nice optimizations that make it *O*(*n*5/6), whereas most of the solutions are *O*(*n*2). This means it can handle a 14-digit product in a second of run time, whereas the human-written solution can only handle 10-digit products in one second, while the HuggingChat version (for example) takes several seconds just to handle a 5-digit product."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "fe69bb04-02de-4f38-a8ab-6693c15b02c4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(1, 2, 54), (1, 3, 36), (1, 4, 27), (1, 6, 18), (1, 9, 12), (2, 3, 18), (2, 6, 9), (3, 4, 9)]\n"
]
}
],
"source": [
"def find_triplets(product):\n",
" triplets = []\n",
" for i in range(1, int(product ** (1/3)) + 1): # The cube root of the product is the maximum possible value for i\n",
" if product % i == 0:\n",
" for j in range(i + 1, int((product / i) ** 0.5) + 1): # The square root of the product divided by i is the maximum possible value for j\n",
" if (product / i) % j == 0:\n",
" k = product // (i * j)\n",
" if k > j: # Ensure the integers are distinct\n",
" triplets.append((i, j, k))\n",
" return triplets\n",
"\n",
"triplets = find_triplets(108)\n",
"print(triplets)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc16ca96-0b05-4ad4-82c0-552bb99373fd",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}