Advent 2025 one more day
@@ -11,15 +11,27 @@
|
||||
"\n",
|
||||
"I enjoy doing the [**Advent of Code**](https://adventofcode.com/) (AoC) programming puzzles, and **my** solutions are [**over here**](Advent2025.ipynb). \n",
|
||||
"\n",
|
||||
"In **this** notebook I show some solutions by various AI Large Language Models: Gemini, Claude, and ChatGPT. Each day I'll choose a model and give it the prompt \"*Write code to solve the following problem:*\" followed by the full text of the AoC Part 1 problem description. Then I'll pronmpt again with \"*There is a change to the specification:*\" followed by the AoC Part 2 description.\n",
|
||||
"In **this** notebook I show some solutions by various AI Large Language Models: Gemini, Claude, and ChatGPT.\n",
|
||||
"\n",
|
||||
"In order to understand what's going on here, you'll have to look at the problem descriptions at [**Advent of Code**](https://adventofcode.com/2025).\n",
|
||||
"\n",
|
||||
"Each day I'll choose a model and give it the prompt \"*Write code to solve the following problem:*\" followed by the full text of the AoC Part 1 problem description. Then I'll prompt again with \"*There is a change to the specification:*\" followed by the AoC Part 2 description. So far the LLMs are doing quite well. \n",
|
||||
"\n",
|
||||
"For brevity, I have removed some of the models' output, such as:\n",
|
||||
"- Prose descriptions of the programs. (In most cases these were accurate and thorough!)\n",
|
||||
"- The `\"__main__\"` idiom for running code from the command line.\n",
|
||||
"- Test examples to run.\n",
|
||||
"\n",
|
||||
"Overall, the models did well, producing code that gives the correct answer in a reasonable run time. Some of the code could be improved stylistically. (But I guess if you're vibe coding and not looking at the code, maybe that doesn't matter.)\n",
|
||||
"\n",
|
||||
"# Day 0\n",
|
||||
"\n",
|
||||
"I load my [**AdventUtils.ipynb**](AdventUtils.ipynb) and set the `current_year` so I can access my input files with `get_text(day_number)` and can use my `answer` function to verify whether the AI systems get the right answer."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"execution_count": 1,
|
||||
"id": "34563e0b-09c5-4600-a455-3ff0f31b81a0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -35,12 +47,12 @@
|
||||
"source": [
|
||||
"# [Day 1](https://adventofcode.com/2025/day/1) Gemini 3 Pro\n",
|
||||
"\n",
|
||||
"I started with the Gemini 3 Pro Fast model, which produced a nice analysis of the problem (which I have omitted) and the following code:"
|
||||
"I started with the Gemini 3 Pro Fast model, which produced the following code:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 60,
|
||||
"execution_count": 2,
|
||||
"id": "a03354cc-951c-4609-80e1-a7e1a6d0c2bd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -103,22 +115,22 @@
|
||||
"id": "c5219309-4f2f-49b3-8208-8c7a3526e328",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"A bit verbose and over-commented, but pretty nice. Below I interface my data (obtained by a call to `get_text(1)`) to this function and verify that it is correct:"
|
||||
"A bit verbose and over-commented, but pretty nice. Below I fetch the Day 1 data with `get_text(1)`, split it into lines (as expected by Gemini's `solve_safe_password` function), and verify that the answer is correct:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 49,
|
||||
"execution_count": 3,
|
||||
"id": "c3ff875a-6267-47b8-a032-115a40bfd246",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 1.1: .0014 seconds, answer 1182 ok"
|
||||
"Puzzle 1.1: .0013 seconds, answer 1182 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 49,
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -135,12 +147,12 @@
|
||||
"id": "47a04265-2dab-4a4c-9a24-e1c4c1b93c6b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"I then gave Gemini Fast the instructions for Part 2, and again it had a nice analysis and the following code:"
|
||||
"For Part 2 I gave Gemini Fast the updated instructions, and again it had a nice analysis and the following code:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"execution_count": 4,
|
||||
"id": "a1cf0e58-4567-4713-9d4b-e57f5c958800",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -219,29 +231,7 @@
|
||||
" \n",
|
||||
" current_position = (current_position + rotation_value) % MODULO\n",
|
||||
"\n",
|
||||
" return total_zero_crossings\n",
|
||||
"\n",
|
||||
"# --- EXAMPLE INPUT VERIFICATION ---\n",
|
||||
"example_rotations = [\n",
|
||||
" \"L68\", \"L30\", \"R48\", \"L5\", \"R60\", \n",
|
||||
" \"L55\", \"L1\", \"L99\", \"R14\", \"L82\"\n",
|
||||
"]\n",
|
||||
"# Expected output for example is 6.\n",
|
||||
"# print(f\"Example verification: {solve_new_safe_password(example_rotations)}\") \n",
|
||||
"# The function will output 6, confirming the logic.\n",
|
||||
"\n",
|
||||
"# --- YOUR PUZZLE INPUT GOES HERE ---\n",
|
||||
"# Replace this list with your actual puzzle input data\n",
|
||||
"puzzle_input = [\n",
|
||||
" # Paste your list of rotations here, one string per line\n",
|
||||
" # e.g., \"R10\", \"L5\", ...\n",
|
||||
" \"L68\", \"L30\", \"R48\", \"L5\", \"R60\", \"L55\", \"L1\", \"L99\", \"R14\", \"L82000\",\n",
|
||||
" # ... and so on for all your input lines\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"# Run the simulation with your full input\n",
|
||||
"#actual_password = solve_new_safe_password(puzzle_input)\n",
|
||||
"#print(f\"The actual password (total zero crossings) is: {actual_password}\")"
|
||||
" return total_zero_crossings"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -249,24 +239,24 @@
|
||||
"id": "f8e02f20-87b1-4bd1-9bdb-2bd187fc42cb",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that the code does `import math` but then never uses the math module. (I've done that, but I try to clean things up.)\n",
|
||||
"Note that the code does `import math` but then never uses the math module. (I've done that, but I try to eventually clean things up. Maybe that's the difference between Gemini \"Fast\" and Gemini \"Thinking\" models.)\n",
|
||||
"\n",
|
||||
"Here I check whether the code is correct:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"execution_count": 5,
|
||||
"id": "a0b3e0fc-e044-4c1e-b4b6-6bc735145f57",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 1.2: .0017 seconds, answer 7509 WRONG; expected answer is 6907"
|
||||
"Puzzle 1.2: .0018 seconds, answer 7509 WRONG; expected answer is 6907"
|
||||
]
|
||||
},
|
||||
"execution_count": 14,
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -281,7 +271,7 @@
|
||||
"id": "d640d8e4-6940-416d-93af-b55d82482a35",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Too bad. I switched to the Gemini 3 Pro Thinking model and gave the prompt \"*That's not quite right.*\" It responded with\n",
|
||||
"Too bad. I switched to the Gemini 3 Pro \"Thinking\" model and gave the prompt \"*That's not quite right.*\" It responded with\n",
|
||||
"\n",
|
||||
">It seems I made a subtle logic error in my previous calculation for the Right (R) rotation.\n",
|
||||
">\n",
|
||||
@@ -292,7 +282,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"execution_count": 6,
|
||||
"id": "196a6a93-e318-4a45-a023-9997fa0fb0f5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -344,24 +334,7 @@
|
||||
" else:\n",
|
||||
" current_position = (current_position - distance) % 100\n",
|
||||
"\n",
|
||||
" return total_zero_crossings\n",
|
||||
"\n",
|
||||
"# --- Example Input Verification ---\n",
|
||||
"example_rotations = [\n",
|
||||
" \"L68\", \"L30\", \"R48\", \"L5\", \"R60\", \n",
|
||||
" \"L55\", \"L1\", \"L99\", \"R14\", \"L82\"\n",
|
||||
"]\n",
|
||||
"# Expected output: 6\n",
|
||||
"# print(f\"Example result: {solve_safe_password_method_2(example_rotations)}\")\n",
|
||||
"\n",
|
||||
"# --- YOUR PUZZLE INPUT GOES HERE ---\n",
|
||||
"puzzle_input = [\n",
|
||||
" # Copy and paste your full input list here\n",
|
||||
" # \"L10\", \"R5\", etc...\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"# result = solve_safe_password_method_2(puzzle_input)\n",
|
||||
"# print(f\"The actual password is: {result}\")"
|
||||
" return total_zero_crossings"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -374,17 +347,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"execution_count": 7,
|
||||
"id": "021043ed-2ac5-4aaf-8171-985f0f5911f2",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 1.2: .0015 seconds, answer 6907 ok"
|
||||
"Puzzle 1.2: .0024 seconds, answer 6907 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 19,
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -401,12 +374,12 @@
|
||||
"source": [
|
||||
"# [Day 2](https://adventofcode.com/2025/day/2) Claude Opus 4.5\n",
|
||||
"\n",
|
||||
"I gave Claude the instructions for Day 2 Part 1 and it wrote some code and then asked me to paste in the input file. I did and Claude ran the code, producing the correct answer but printing a lot of unneccessary debugging output along the way. I prompted it to \"*Change the code to not print anything, just return the answer*\" and got the following:"
|
||||
"I gave Claude the instructions for Day 2 Part 1 and it wrote some code that produces the correct answer but prints a lot of unnecessary debugging output along the way. I prompted it to \"*Change the code to not print anything, just return the answer*\" and got the following:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 105,
|
||||
"execution_count": 8,
|
||||
"id": "8eac98f3-b884-4d95-b38b-ea4365ec3004",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -447,24 +420,24 @@
|
||||
"id": "2bd0db00-952b-47e5-b787-b3887b7539f1",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This code is overall rather nice, but conspicously lacks comments and doc strings. It uses the more efficient \"enumerate over the first half of the digit string\" strategy, but is not precise in narrowing down the range it enumerates over. For example, if the range is \"999000-109000\", this code will enumnrate the range(100, 1000), when it could enumerate just range(999, 1000).\n",
|
||||
"This code is overall rather nice, but conspicuously lacks comments and doc strings. It uses the more efficient \"enumerate over the first half of the digit string\" strategy, but is not precise in narrowing down the range it enumerates over. For example, for the range \"999000-109000\", this code will enumerate the range (100, 1000), when it could enumerate just the range (999, 1000).\n",
|
||||
"\n",
|
||||
"I verified that the code gives the correct answer:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 98,
|
||||
"execution_count": 9,
|
||||
"id": "a91845ec-ace7-482e-b0b5-8a620ef3461f",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 2.1: .1303 seconds, answer 23560874270 ok"
|
||||
"Puzzle 2.1: .1263 seconds, answer 23560874270 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 98,
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -481,12 +454,12 @@
|
||||
"id": "a31d006f-8cf2-4e4c-92d3-d7b7def22227",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Claude then wrote the following code when given the Part 2 instructions, nicely generalizing to any number of repeats:"
|
||||
"When given the Part 2 instructions, Claude wrote the following code:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 99,
|
||||
"execution_count": 10,
|
||||
"id": "f0dc176b-dd85-40a4-ac5c-dfa936a6a524",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -538,17 +511,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 100,
|
||||
"execution_count": 11,
|
||||
"id": "9c0049e6-a992-4aa8-a2d7-3ea748e361a6",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 2.2: .1816 seconds, answer 44143124633 ok"
|
||||
"Puzzle 2.2: .1350 seconds, answer 44143124633 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 100,
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -565,12 +538,12 @@
|
||||
"source": [
|
||||
"# [Day 3](https://adventofcode.com/2025/day/3) ChatGPT 5.1 Auto\n",
|
||||
"\n",
|
||||
"ChatGPT gave a very brief analysis of the problem and then wrote a program that was designed to be called from the command line, using the `if __name__ == \"__main__\"` idiom. I told it \"I don't want to run it like that, I want a function that I can pass the input text and have it return an int.\" and it produced this code:\n"
|
||||
"ChatGPT gave a very brief analysis of the problem and then wrote a program that was designed to be called from the command line, using the `\"__main__\"` idiom. I told it \"I don't want to run it like that, I want a function that I can pass the input text and have it return an int.\" and it produced this code (lacking comments or doc strings):"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 55,
|
||||
"execution_count": 12,
|
||||
"id": "3aa266f3-50d0-4d8d-a464-4c74c52daa69",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -609,17 +582,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 56,
|
||||
"execution_count": 13,
|
||||
"id": "09bf306b-8762-4346-aff9-bcff33639c71",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 3.1: .0040 seconds, answer 17085 ok"
|
||||
"Puzzle 3.1: .0044 seconds, answer 17085 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 56,
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -632,16 +605,16 @@
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f0398d5b-485d-4479-9321-878564180b68",
|
||||
"cell_type": "markdown",
|
||||
"id": "4a07f37f-c5e3-4484-a7b1-2cae0ff5bd01",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
"source": [
|
||||
"For Part 2 ChatGPT did well (and for some reason included comments and doc strings this time):"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 58,
|
||||
"execution_count": 14,
|
||||
"id": "bdb8b4e4-bed0-48dc-a045-47cd4c6002fd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -692,17 +665,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 59,
|
||||
"execution_count": 15,
|
||||
"id": "70bde9b9-beb1-4e9d-bef6-0f20fb958891",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 3.2: .0069 seconds, answer 169408143086082 ok"
|
||||
"Puzzle 3.2: .0076 seconds, answer 169408143086082 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 59,
|
||||
"execution_count": 15,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -719,12 +692,12 @@
|
||||
"source": [
|
||||
"# [Day 4](https://adventofcode.com/2025/day/4): Gemini 3 Pro\n",
|
||||
"\n",
|
||||
"Gemini produced a solution to Part 1 that is straightforward and efficient, although perhpas could use some abstraction (e.g. if they had a function to count neighbors, they wouldn't need the `continue`)."
|
||||
"Gemini produced a solution to Part 1 that is straightforward and efficient, although it perhaps could use some abstraction (e.g. if they had a function to count neighbors, they wouldn't need the `continue`)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 75,
|
||||
"execution_count": 16,
|
||||
"id": "35bf1f30-07c7-4842-a6e3-e33fb874e779",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -780,22 +753,22 @@
|
||||
"id": "1e12bc4c-8cc8-4c01-b4ad-5392b49642e6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here I get the input and verify that the code does produce the correct answer:"
|
||||
"Here I verify that the code does produce the correct answer:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 82,
|
||||
"execution_count": 17,
|
||||
"id": "5b54c152-ce26-4baf-8b51-d4a166c6c2e7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 4.1: .0163 seconds, answer 1569 ok"
|
||||
"Puzzle 4.1: .0171 seconds, answer 1569 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 82,
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -817,7 +790,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 83,
|
||||
"execution_count": 18,
|
||||
"id": "16a1a0db-7501-41fd-a606-87fbb79273bd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -878,17 +851,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 79,
|
||||
"execution_count": 19,
|
||||
"id": "b47c2e05-978a-4b22-aafc-e31ee1825387",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 4.2: .4017 seconds, answer 9280 ok"
|
||||
"Puzzle 4.2: .3960 seconds, answer 9280 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 79,
|
||||
"execution_count": 19,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -912,7 +885,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 86,
|
||||
"execution_count": 20,
|
||||
"id": "71bfe887-fbd4-4378-b37f-d0b88f9fa3e7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -987,17 +960,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 88,
|
||||
"execution_count": 21,
|
||||
"id": "f370ee38-67af-42a6-9ad3-cdeec2019ff3",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 4.2: .0761 seconds, answer 9280 ok"
|
||||
"Puzzle 4.2: .0937 seconds, answer 9280 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 88,
|
||||
"execution_count": 21,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -1009,15 +982,454 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8aa26008-a652-4860-9c84-5ba4344d32f3",
|
||||
"id": "78434cfe-d728-453c-8f45-fc6b5fea18c3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Summary"
|
||||
"# [Day 5](https://adventofcode.com/2025/day/5): Claude Opus 4.5\n",
|
||||
"\n",
|
||||
"Claude produces a straightforward program that solves Part 1 just fine and demonstrates good use of abstraction."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 101,
|
||||
"execution_count": 22,
|
||||
"id": "e7ab7dac-9686-4a76-b83f-6779275c3519",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\"\"\"\n",
|
||||
"Solution for the Fresh Ingredients puzzle.\n",
|
||||
"Parses a database with fresh ingredient ID ranges and available ingredient IDs,\n",
|
||||
"then counts how many available ingredients are fresh.\n",
|
||||
"\"\"\"\n",
|
||||
"\n",
|
||||
"def parse_input(input_text):\n",
|
||||
" \"\"\"Parse the input into ranges and ingredient IDs.\"\"\"\n",
|
||||
" parts = input_text.strip().split('\\n\\n')\n",
|
||||
" \n",
|
||||
" # Parse fresh ranges\n",
|
||||
" ranges = []\n",
|
||||
" for line in parts[0].strip().split('\\n'):\n",
|
||||
" start, end = map(int, line.split('-'))\n",
|
||||
" ranges.append((start, end))\n",
|
||||
" \n",
|
||||
" # Parse available ingredient IDs\n",
|
||||
" ingredient_ids = []\n",
|
||||
" for line in parts[1].strip().split('\\n'):\n",
|
||||
" ingredient_ids.append(int(line))\n",
|
||||
" \n",
|
||||
" return ranges, ingredient_ids\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def is_fresh(ingredient_id, ranges):\n",
|
||||
" \"\"\"Check if an ingredient ID falls within any fresh range.\"\"\"\n",
|
||||
" for start, end in ranges:\n",
|
||||
" if start <= ingredient_id <= end:\n",
|
||||
" return True\n",
|
||||
" return False\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def count_fresh_ingredients(ranges, ingredient_ids):\n",
|
||||
" \"\"\"Count how many ingredient IDs are fresh.\"\"\"\n",
|
||||
" count = 0\n",
|
||||
" for ingredient_id in ingredient_ids:\n",
|
||||
" if is_fresh(ingredient_id, ranges):\n",
|
||||
" count += 1\n",
|
||||
" return count\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def solve(input_text):\n",
|
||||
" \"\"\"Solve the puzzle and return the count of fresh ingredients.\"\"\"\n",
|
||||
" ranges, ingredient_ids = parse_input(input_text)\n",
|
||||
" return count_fresh_ingredients(ranges, ingredient_ids)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "40fc662b-ecee-4abe-a23a-ca7786edd438",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here I feed the input to `solve` and verify that the answer is correct:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "49f0bb9c-00c0-4983-ab26-d1cec1e692ac",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 5.1: .0058 seconds, answer 635 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 23,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"input_text = get_text(5)\n",
|
||||
"\n",
|
||||
"answer(5.1, 635, lambda:\n",
|
||||
" solve(input_text))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "dabe3fbc-0fa6-46d0-adfb-f6413f3a63d5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"For Part 2, Claude chose to sort ranges and then merge them. That's a reasonable approach."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "b70269fb-3a0b-4728-9d60-421e3b35a877",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\"\"\"\n",
|
||||
"Solution for the Fresh Ingredients puzzle - Part 2.\n",
|
||||
"Count all unique ingredient IDs that are considered fresh by any range.\n",
|
||||
"\"\"\"\n",
|
||||
"\n",
|
||||
"def parse_input(input_text):\n",
|
||||
" \"\"\"Parse the input to get fresh ranges (ignore ingredient IDs section).\"\"\"\n",
|
||||
" parts = input_text.strip().split('\\n\\n')\n",
|
||||
" \n",
|
||||
" # Parse fresh ranges (only first section matters now)\n",
|
||||
" ranges = []\n",
|
||||
" for line in parts[0].strip().split('\\n'):\n",
|
||||
" start, end = map(int, line.split('-'))\n",
|
||||
" ranges.append((start, end))\n",
|
||||
" \n",
|
||||
" return ranges\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def merge_ranges(ranges):\n",
|
||||
" \"\"\"Merge overlapping ranges to count unique IDs efficiently.\"\"\"\n",
|
||||
" if not ranges:\n",
|
||||
" return []\n",
|
||||
" \n",
|
||||
" # Sort ranges by start value\n",
|
||||
" sorted_ranges = sorted(ranges)\n",
|
||||
" \n",
|
||||
" merged = [sorted_ranges[0]]\n",
|
||||
" \n",
|
||||
" for current_start, current_end in sorted_ranges[1:]:\n",
|
||||
" last_start, last_end = merged[-1]\n",
|
||||
" \n",
|
||||
" # Check if current range overlaps or is adjacent to the last merged range\n",
|
||||
" if current_start <= last_end + 1:\n",
|
||||
" # Merge by extending the end if needed\n",
|
||||
" merged[-1] = (last_start, max(last_end, current_end))\n",
|
||||
" else:\n",
|
||||
" # No overlap, add as new range\n",
|
||||
" merged.append((current_start, current_end))\n",
|
||||
" \n",
|
||||
" return merged\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def count_fresh_ids(ranges):\n",
|
||||
" \"\"\"Count total unique fresh ingredient IDs.\"\"\"\n",
|
||||
" merged = merge_ranges(ranges)\n",
|
||||
" \n",
|
||||
" total = 0\n",
|
||||
" for start, end in merged:\n",
|
||||
" # Count is inclusive: end - start + 1\n",
|
||||
" total += end - start + 1\n",
|
||||
" \n",
|
||||
" return total\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def solve(input_text):\n",
|
||||
" \"\"\"Solve part 2 and return the count of all fresh ingredient IDs.\"\"\"\n",
|
||||
" ranges = parse_input(input_text)\n",
|
||||
" return count_fresh_ids(ranges)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 25,
|
||||
"id": "bc525b8f-8c03-4ac1-a52d-1ba2d8c1e5bb",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 5.2: .0002 seconds, answer 369761800782619 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 25,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"answer(5.2, 369761800782619, lambda:\n",
|
||||
" solve(input_text))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b1503029-3a5f-4949-8502-75b051f78a23",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# [Day 6](https://adventofcode.com/2025/day/6): ChatGPT 5.1 Auto\n",
|
||||
"\n",
|
||||
"ChatGPT produced a program that is correct, but has poor abstraction, with one long 63-line function. (It also contains a pet peeve of mine: the pattern \"`if some_boolean: True else: False`\" can always be replaced with \"`some_boolean`\".)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 26,
|
||||
"id": "fc4e8201-532f-47e8-aa95-e5dad13b8f4f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import re\n",
|
||||
"from functools import reduce\n",
|
||||
"from operator import add, mul\n",
|
||||
"\n",
|
||||
"def solve(lines):\n",
|
||||
" if not lines:\n",
|
||||
" return 0\n",
|
||||
"\n",
|
||||
" # Preserve internal spaces; pad all lines to same width\n",
|
||||
" width = max(len(line) for line in lines)\n",
|
||||
" grid = [line.rstrip('\\n').ljust(width) for line in lines]\n",
|
||||
" h = len(grid)\n",
|
||||
"\n",
|
||||
" # Identify separator columns: columns that are all spaces\n",
|
||||
" sep = []\n",
|
||||
" for c in range(width):\n",
|
||||
" if all(grid[r][c] == ' ' for r in range(h)):\n",
|
||||
" sep.append(True)\n",
|
||||
" else:\n",
|
||||
" sep.append(False)\n",
|
||||
"\n",
|
||||
" # Find ranges of columns that form individual problems\n",
|
||||
" problem_ranges = []\n",
|
||||
" in_problem = False\n",
|
||||
" start = 0\n",
|
||||
" for c in range(width):\n",
|
||||
" if not sep[c]:\n",
|
||||
" if not in_problem:\n",
|
||||
" in_problem = True\n",
|
||||
" start = c\n",
|
||||
" else:\n",
|
||||
" if in_problem:\n",
|
||||
" in_problem = False\n",
|
||||
" problem_ranges.append((start, c - 1))\n",
|
||||
" if in_problem:\n",
|
||||
" problem_ranges.append((start, width - 1))\n",
|
||||
"\n",
|
||||
" grand_total = 0\n",
|
||||
"\n",
|
||||
" # Process each problem\n",
|
||||
" for c_start, c_end in problem_ranges:\n",
|
||||
" numbers = []\n",
|
||||
"\n",
|
||||
" # All rows except the last contain numbers (potentially)\n",
|
||||
" for r in range(h - 1):\n",
|
||||
" segment = grid[r][c_start:c_end + 1]\n",
|
||||
" for match in re.finditer(r'\\d+', segment):\n",
|
||||
" numbers.append(int(match.group(0)))\n",
|
||||
"\n",
|
||||
" # Last row contains the operator (+ or *)\n",
|
||||
" bottom_segment = grid[h - 1][c_start:c_end + 1]\n",
|
||||
" if '+' in bottom_segment:\n",
|
||||
" op = add\n",
|
||||
" elif '*' in bottom_segment:\n",
|
||||
" op = mul\n",
|
||||
" else:\n",
|
||||
" # No valid operator found; skip this malformed problem\n",
|
||||
" continue\n",
|
||||
"\n",
|
||||
" if not numbers:\n",
|
||||
" continue\n",
|
||||
"\n",
|
||||
" # Fold numbers using the chosen operator\n",
|
||||
" result = reduce(op, numbers)\n",
|
||||
" grand_total += result\n",
|
||||
"\n",
|
||||
" return grand_total"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "54e76b17-4372-4074-b7f9-9439c44bc3c4",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here I interface my input data to ChatGPT's function and verify it is correct:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 27,
|
||||
"id": "d6beae51-7205-4cae-b4d7-495da2953baf",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Puzzle 6.1: .0078 seconds, answer 5877594983578 ok"
|
||||
]
|
||||
},
|
||||
"execution_count": 27,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"lines = get_text(6).splitlines()\n",
|
||||
"\n",
|
||||
"answer(6.1, 5877594983578, lambda:\n",
|
||||
" solve(lines))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "28de0612-03f3-4587-88cb-18ef9b10ee00",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"For Part 2 we get an even longer function, with even more complications. For example, in \"Determine the operator\", the 14 lines could be replaced with `op = (add if '+' in segment else mul)`. However, the program is correct."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 28,
|
||||
"id": "053c3971-9aa5-490a-a016-f303b391c5d8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import re\n",
|
||||
"from functools import reduce\n",
|
||||
"from operator import add, mul\n",
|
||||
"\n",
|
||||
"def solve(lines):\n",
|
||||
" if not lines:\n",
|
||||
" return 0\n",
|
||||
"\n",
|
||||
" # Normalize lines to a rectangle, preserving internal spaces\n",
|
||||
" width = max(len(line) for line in lines)\n",
|
||||
" grid = [line.rstrip('\\n').ljust(width) for line in lines]\n",
"    h = len(grid)\n",
"\n",
"    # A separator column is all spaces -> separates problems\n",
"    sep = []\n",
"    for c in range(width):\n",
"        if all(grid[r][c] == ' ' for r in range(h)):\n",
"            sep.append(True)\n",
"        else:\n",
"            sep.append(False)\n",
"\n",
"    # Find contiguous ranges of non-separator columns (each is a problem)\n",
"    problem_ranges = []\n",
"    in_problem = False\n",
"    start = 0\n",
"    for c in range(width):\n",
"        if not sep[c]:\n",
"            if not in_problem:\n",
"                in_problem = True\n",
"                start = c\n",
"        else:\n",
"            if in_problem:\n",
"                in_problem = False\n",
"                problem_ranges.append((start, c - 1))\n",
"    if in_problem:\n",
"        problem_ranges.append((start, width - 1))\n",
"\n",
"    grand_total = 0\n",
"\n",
"    for c_start, c_end in problem_ranges:\n",
"        # Find the operator row for THIS problem slice:\n",
"        # bottom-most row with + or * in [c_start, c_end]\n",
"        op_row = None\n",
"        for r in range(h):\n",
"            segment = grid[r][c_start:c_end + 1]\n",
"            if '+' in segment or '*' in segment:\n",
"                op_row = r\n",
"\n",
"        if op_row is None:\n",
"            # No operator for this vertical slice -> skip\n",
"            continue\n",
"\n",
"        # Determine the operator\n",
"        op_segment = grid[op_row][c_start:c_end + 1]\n",
"        if '+' in op_segment and '*' in op_segment:\n",
"            # If both appear (shouldn't in valid input), choose the leftmost\n",
"            plus_idx = op_segment.find('+') if '+' in op_segment else float('inf')\n",
"            mul_idx = op_segment.find('*') if '*' in op_segment else float('inf')\n",
"            op = add if plus_idx < mul_idx else mul\n",
"        elif '+' in op_segment:\n",
"            op = add\n",
"        elif '*' in op_segment:\n",
"            op = mul\n",
"        else:\n",
"            # No operator actually present\n",
"            continue\n",
"\n",
"        # Build numbers column-by-column (each column is one number),\n",
"        # reading digits from top to just above op_row\n",
"        numbers = []\n",
"        for c in range(c_start, c_end + 1):\n",
"            digits = []\n",
"            for r in range(op_row):\n",
"                ch = grid[r][c]\n",
"                if ch.isdigit():\n",
"                    digits.append(ch)\n",
"            if digits:\n",
"                num = int(''.join(digits))\n",
"                numbers.append(num)\n",
"\n",
"        if not numbers:\n",
"            continue\n",
"\n",
"        result = reduce(op, numbers)\n",
"        grand_total += result\n",
"\n",
"    return grand_total"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "e37018e7-0c82-4501-830e-dfe55efc3ad6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Puzzle 6.2: .0064 seconds, answer 11159825706149 ok"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"answer(6.2, 11159825706149, lambda:\n",
" solve(lines))"
]
},
{
"cell_type": "markdown",
"id": "8aa26008-a652-4860-9c84-5ba4344d32f3",
"metadata": {},
"source": [
"# Summary\n",
"\n",
"The LLMs got all the problems right (with only a little extra prompting), and the programs are all reasonably efficient (the Day 2 programs could be faster). The readability of the code varies."
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "2d03c39d-42f5-4f51-89b9-638d8d4a4a35",
"metadata": {},
"outputs": [
@@ -1025,29 +1437,36 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Puzzle 1.1: .0014 seconds, answer 1182 ok\n",
"Puzzle 1.2: .0015 seconds, answer 6907 ok\n",
"Puzzle 2.1: .1303 seconds, answer 23560874270 ok\n",
"Puzzle 2.2: .1816 seconds, answer 44143124633 ok\n",
"Puzzle 3.1: .0040 seconds, answer 17085 ok\n",
"Puzzle 3.2: .0069 seconds, answer 169408143086082 ok\n",
"Puzzle 4.1: .0163 seconds, answer 1569 ok\n",
"Puzzle 4.2: .0761 seconds, answer 9280 ok\n"
"Puzzle 1.1: .0013 seconds, answer 1182 ok\n",
"Puzzle 1.2: .0024 seconds, answer 6907 ok\n",
"Puzzle 2.1: .1263 seconds, answer 23560874270 ok\n",
"Puzzle 2.2: .1350 seconds, answer 44143124633 ok\n",
"Puzzle 3.1: .0044 seconds, answer 17085 ok\n",
"Puzzle 3.2: .0076 seconds, answer 169408143086082 ok\n",
"Puzzle 4.1: .0171 seconds, answer 1569 ok\n",
"Puzzle 4.2: .0937 seconds, answer 9280 ok\n",
"Puzzle 5.1: .0058 seconds, answer 635 ok\n",
"Puzzle 5.2: .0002 seconds, answer 369761800782619 ok\n",
"Puzzle 6.1: .0078 seconds, answer 5877594983578 ok\n",
"Puzzle 6.2: .0064 seconds, answer 11159825706149 ok\n"
]
},
{
"data": {
"text/plain": [
"0.40801501274108887"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for d in sorted(answers):\n",
"    print(answers[d])"
"    print(answers[d])\n",
"sum(a.secs for a in answers.values())"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e8b776-455d-405b-9370-2443daddee9b",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
File diff suppressed because it is too large
@@ -15,7 +15,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -119,9 +119,20 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 18,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"((9, 5), (123, 456))"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Char = str # Intended as the type of a one-character string\n",
"Atom = Union[str, float, int] # The type of a string or number\n",
@@ -211,7 +222,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
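The column-scanning steps in the Day 6 Part 2 `solve` code in this diff (flag the all-space columns, then collect each run of non-separator columns as one problem) can be sketched more compactly. This is a minimal sketch, not the model's output: `column_ranges` is a hypothetical helper name, the three-row grid is a made-up example, and `itertools.groupby` stands in for the explicit `in_problem` state machine.

```python
from itertools import groupby


def column_ranges(lines):
    """Return (start, end) column ranges of the problems in a text grid.

    Hypothetical helper for illustration: a column made up entirely of
    spaces separates two problems, as in the Day 6 code above.
    """
    rows = [line.rstrip('\n') for line in lines]
    width = max((len(row) for row in rows), default=0)
    grid = [row.ljust(width) for row in rows]

    # A column is a separator when every row holds a space there.
    is_sep = [all(row[c] == ' ' for row in grid) for c in range(width)]

    ranges = []
    # groupby yields maximal runs of equal flags; keep the non-separator runs.
    for sep, run in groupby(range(width), key=lambda c: is_sep[c]):
        if not sep:
            cols = list(run)
            ranges.append((cols[0], cols[-1]))
    return ranges


# Made-up grid: two problems separated by one blank column.
example = ["123 45",
           "  6 7 ",
           "+   * "]
print(column_ranges(example))  # → [(0, 2), (4, 5)]
```

The ranges are inclusive on both ends, matching the `(start, c - 1)` convention of `problem_ranges` in the original code.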