{ "cells": [ { "cell_type": "markdown", "id": "af842d66-6b62-45e0-8ff3-3222494fbaed", "metadata": {}, "source": [ "
Peter Norvig
Sept 25, 2024
Update May 21, 2025
\n", "\n", "# LLMs, Theory of Mind, and Cheryl's Birthday\n", "\n", "There has been [much](https://spectrum.ieee.org/theory-of-mind-ai) [debate](https://aclanthology.org/2023.conll-1.25/) [on](https://www.gsb.stanford.edu/faculty-research/working-papers/theory-mind-may-have-spontaneously-emerged-large-language-models) [the](https://arxiv.org/abs/2302.02083) [degree](https://www.nature.com/articles/s41562-024-01882-z) to which Large Language Models (LLMs) have a [theory of mind](https://en.wikipedia.org/wiki/Theory_of_mind): a way of understanding what other people know and don't know. In this notebook I explore one small part of the issue by asking nine LLMs to solve the [Cheryl's Birthday Problem](https://en.wikipedia.org/wiki/Cheryl%27s_Birthday), a well-known logic puzzle in which different characters have different states of knowledge at different times. The LLMs were all familiar with the puzzle, so if I just asked them to solve it, they all would just recall the answer they have memorized, \"July 16.\" Instead I test the LLMs with the following two prompts, one a novel formulation that alters the dates and wording, and one that requires writing a general program that works for any set of dates:\n", "\n", "## Reasoning Test\n", "\n", "1. **Ali and Bo are friends with Cam. Cam told them that her anniversary is one of 10 possible dates:**\n", " - **April 17, April 18, April 28, July 16, July 17, July 19, June 16, June 29, March 18, March 19**\n", "3. **Cam then privately tells Ali the month and Bo the day number of the anniversary.**\n", "4. **Ali: \"I don't know when Cam’s anniversary is, and I know that Bo does not know it either.\"**\n", "5. **Bo: \"At first I didn't know when Cam’s anniversary was, but I know now, after Ali's statement.\"**\n", "6. **Ali: \"Then I also know when Cam’s anniversary is.\"**\n", "7. **When is Cam’s anniversary?**\n", "\n", "## Programming Test\n", "\n", "- **What is the answer to the \"Cheryl's Birthday\" problem? 
Write a Python program to solve it. Make sure that the program will still work if the list of possible dates is changed.**\n", "\n", "\n", "## Results\n", "\n", "The correct answer for the reasoning test is **March 18**, and for the programming test, **July 16**.\n", "\n", "In **September 2024**, all of the LLMs failed both tests. They couldn't distinguish the different knowledge states of Albert and Bernard. At least with respect to this problem, they had a poorly developed theory of mind. \n", "\n", "But in **May 2025**, all the LLMs had new versions, and three passed the programming test while four passed the reasoning test:\n", "\n", "|2024 Model/version|2025 version|2024 Program|2024 Reasoning|2025 Program|2025 Reasoning|\n", "|--|--|:--:|:--:|:-:|:--:|\n", "|[A human](https://github.com/norvig/)|older human|✅|✅|✅|✅|\n", "|[Gemini Advanced](https://gemini.google.com/app)|2.5 Pro|❌ |❌|✅ |✅|\n", "|[You.com](https://you.com/)|Compute|❌ |❌|✅ |✅|\n", "|[Anthropic Claude 3.5 Sonnet](https://claude.ai/new)|Sonnet 4|❌ |❌|✅ |❌ |\n", "|[ChatGPT 4o](https://chatgpt.com/)|o4-mini-high|❌ |❌|❌ |✅|\n", "|[Perplexity](https://www.perplexity.ai/)|Pro|❌ |❌|❌ |✅|\n", "|[Microsoft Copilot](https://copilot.microsoft.com)|Copilot|❌|❌ |❌|❌|\n", "|[Meta AI Llama 405B](https://www.meta.ai/)|AI|❌ |❌|❌ |❌|\n", "|[Cohere Chat](https://cohere.com/chat)|Command A|❌ |❌|❌ |❌|\n", "|[HuggingFace Chat](https://huggingface.co/chat/)|v0.9.4|❌|❌ |❌|❌|" ] }, { "cell_type": "markdown", "id": "cc9df830-5dcb-4f67-99a8-870f30653c64", "metadata": {}, "source": [ "# 2024 and 2025: Reasoning\n", "\n", "As mentioned above, I created a reworded version of the puzzle and used it as a prompt:\n", "\n", "1. **Ali and Bo are friends with Cam. Cam told them that her anniversary is one of 10 possible dates:**\n", " - **April 17, April 18, April 28, July 16, July 17, July 19, June 16, June 29, March 18, March 19**\n", "3. **Cam then privately tells Ali the month and Bo the day number of the anniversary.**\n", "4. **Ali: \"I don't know when Cam’s anniversary is, and I know that Bo does not know it either.\"**\n", "5. **Bo: \"At first I didn't know when Cam’s anniversary was, but I know now, after Ali's statement.\"**\n", "6. **Ali: \"Then I also know when Cam’s anniversary is.\"**\n", "7. **When is Cam’s anniversary?**\n", "\n", "(The correct answer is **March 18**.)\n", "\n", "In **2024**, all the LLMs were generally headed in the right direction in their reasoning, but all made mistakes. For example, Claude says \"*Bo hears the day and realizes after Ali's statement. Since Bo did not initially know the date, the day number Bo heard must appear in more than one month. 
Therefore, the days 16, 18, and 19 must be eliminated since they have corresponding unique months.*\" But that's just not right; they don't have unique months. As it turns out, [you.com](http://you.com) did get the right answer on the first try. But some of the reasoning steps were wrong, so I re-ran it, and it failed the second time. Thus, all the LLMs failed in 2024. \n", "\n", "In **2025**, Gemini, You.com, ChatGPT, and Perplexity got the right answer, with well-reasoned steps. The others made mistakes similar to the 2024 versions. I'll only show one example of reasoning, by Gemini 2.5 Pro:\n" ] }, { "cell_type": "markdown", "id": "55db0973-3189-44bb-8503-d61a1e01db68", "metadata": {}, "source": [ "## Gemini 2.5 Pro\n", "\n", "\n", "This is a classic logic puzzle that can be solved by systematically eliminating possibilities based on the statements made by Ali and Bo.\n", "\n", "1. The Initial Possibilities:\n", "\n", "Let's list the dates by month:\n", "\n", "- March: 18, 19\n", "- April: 17, 18, 28\n", "- June: 16, 29\n", "- July: 16, 17, 19\n", "\n", "2. Ali's First Statement: \"I don't know when Cam’s anniversary is, and I know that Bo does not know it either.\"\n", "\n", "\"I don't know...\": This tells us that the month Ali was told must have more than one possible date. All four months (March, April, June, July) have multiple possibilities, so this part doesn't eliminate anything yet.\n", "\"...and I know that Bo does not know it either.\": This is the crucial part. If Ali knows Bo doesn't know, it means that no matter which day is possible in her month, that day number must also appear in another month. 
If there was a day number unique to her month, she couldn't be sure Bo didn't know.\n", "Let's look for unique day numbers in the original list:\n", "\n", "- 28 appears only in April.\n", "- 29 appears only in June.\n", "\n", "If Ali was told April, she couldn't be sure Bo didn't know, because if the date was April 28, Bo (having '28') would know immediately.\n", "If Ali was told June, she couldn't be sure Bo didn't know, because if the date was June 29, Bo (having '29') would know immediately.\n", "Therefore, Ali must have been told a month where all possible days also appear in other months. This eliminates April and June.\n", "\n", "3. The Remaining Possibilities (after Ali's first statement):\n", "\n", "- March: 18, 19\n", "- July: 16, 17, 19\n", "\n", "4. Bo's Statement: \"At first I didn't know when Cam’s anniversary was, but I know now, after Ali's statement.\"\n", "\n", "Bo now knows the date. This means that after eliminating April and June, his day number must now point to only one possible date among the remaining options (March 18, 19; July 16, 17, 19).\n", "Let's check the day numbers in the remaining list:\n", "- 16: Only appears in July 16. If Bo had 16, he would know.\n", "- 17: Only appears in July 17. If Bo had 17, he would know.\n", "- 18: Only appears in March 18. If Bo had 18, he would know.\n", "- 19: Appears in March 19 and July 19. If Bo had 19, he would still not know.\n", "\n", "Since Bo now knows, his day number cannot be 19. This eliminates March 19 and July 19.\n", "\n", "5. The Remaining Possibilities (after Bo's statement):\n", "\n", "- March: 18\n", "- July: 16, 17\n", "\n", "6. Ali's Second Statement: \"Then I also know when Cam’s anniversary is.\"\n", "\n", "Ali knows the month. 
She now knows the only possibilities are March 18, July 16, and July 17.\n", "For her to know the exact date, her month must only have one option left in this list.\n", "If her month was July, the options would be July 16 and July 17 – she wouldn't know.\n", "If her month was March, the only option is March 18 – she would know.\n", "Since Ali now knows, her month must be March.\n", "Conclusion:\n", "\n", "The only remaining possibility that fits all the statements is March 18. Therefore, Cam's anniversary is March 18." ] }, { "cell_type": "markdown", "id": "3028427a-7368-4992-8b48-9d70dd42959f", "metadata": {}, "source": [ "# 2024: Programming\n", "\n", "None of the LLMs produced a correct program in 2024. Each LLM provided explanatory output along with a program; for brevity I only show the explanatory output from the first LLM, ChatGPT 4o, but I'll show the code for all the LLMs. My comments in *[bracketed italics]* explain where they went wrong." ] }, { "cell_type": "markdown", "id": "c3e532bf-67b0-4976-aa5f-730401a4ef07", "metadata": {}, "source": [ "## Human\n", "\n", "An actual human programmer (me) was able to write a program, shown in [**another notebook**](https://github.com/norvig/pytudes/blob/main/ipynb/Cheryl-and-Eve.ipynb), that correctly solves the original problem and also handles different sets of dates. I introduced the idea of a *BeliefState*, a set of possible dates that a person believes might be the birthday, and I modeled a character's *statement* as a function that takes a particular date as input, and returns true if the date is consistent with the statement." ] }, { "cell_type": "markdown", "id": "55f6744e-356c-4fd8-af26-e7559618dd8d", "metadata": {}, "source": [ "## ChatGPT 4o\n", "\n", "*[ChatGPT 4o gives the following response, which starts with an excellent summary of the problem:]*\n", "\n", "The \"Cheryl's Birthday\" problem is a well-known logic puzzle that can be solved using logical deduction. 
Let's first break down the problem before writing a Python program.\n", "\n", "### Problem Statement\n", "Cheryl gives Albert and Bernard a list of 10 possible birthdates, and tells them different information:\n", "- Albert knows the month.\n", "- Bernard knows the day.\n", "\n", "Cheryl provides the following 10 possible dates:\n", "- May 15, May 16, May 19\n", "- June 17, June 18\n", "- July 14, July 16\n", "- August 14, August 15, August 17\n", "\n", "Albert then says: \"I don't know when Cheryl's birthday is, but I know that Bernard doesn't know either.\"\n", "\n", "Bernard replies: \"At first I didn’t know when Cheryl's birthday is, but now I know.\"\n", "\n", "Albert then says: \"Now I know when Cheryl's birthday is.\"\n", "\n", "### Steps for Deduction\n", "1. From Albert's statement, we can deduce that Cheryl's birthday is not in May or June, because if it were, Bernard could possibly know the exact birthday if the day was unique (May 19 or June 18). Since Albert is sure Bernard doesn't know, we can eliminate May and June.\n", "2. After eliminating May and June, the remaining possible dates are:\n", " - July 14, July 16\n", " - August 14, August 15, August 17\n", "3. From Bernard’s statement, now knowing the birthday, we can deduce that Cheryl's birthday isn't on July 14 or August 14, because if it were, Bernard would still be uncertain (since both dates share the day 14).\n", "4. The remaining possible dates are:\n", " - July 16\n", " - August 15\n", " - August 17\n", "5. Finally, from Albert’s statement, we deduce that Cheryl's birthday must be in July, because if it were in August, Albert wouldn’t be able to know the exact date. 
Thus, the birthday is July 16.\n", "\n", "### Python Program to Solve the Problem\n", "We can now write a Python program to generalize this solution and ensure it works for any list of possible dates.\n", "\n", "### Explanation:\n", "- Step 1: The function groups the dates by month and by day.\n", "- Step 2: Based on Albert's statement, we eliminate the months of May and June.\n", "- Step 3: Based on Bernard's statement, we eliminate dates where the day appears more than once (i.e., both July 14 and August 14 are eliminated).\n", "- Step 4: Albert now knows Cheryl's birthday, so it must be in the month where only one date is left (i.e., July).\n", "- \n", "The program is designed to work with any given list of possible dates, not just the ones in the original problem." ] }, { "cell_type": "code", "execution_count": 1, "id": "03ff513c-d390-413e-b524-9c82de44e1a6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cheryl's birthday is: No solution found\n" ] } ], "source": [ "def filter_dates(possible_dates):\n", " # Step 1: Group dates by month and day\n", " month_map = {}\n", " day_map = {}\n", " \n", " for date in possible_dates:\n", " month, day = date.split()\n", " day = int(day)\n", " \n", " # Group by month\n", " if month not in month_map:\n", " month_map[month] = []\n", " month_map[month].append(day)\n", " \n", " # Group by day\n", " if day not in day_map:\n", " day_map[day] = []\n", " day_map[day].append(month)\n", " \n", " # Step 2: Albert says: \"I don't know when Cheryl's birthday is, but I know Bernard doesn't know either.\"\n", " # This implies Cheryl's birthday cannot be in May or June because these months have unique days (19th and 18th).\n", " possible_dates = [date for date in possible_dates if date.split()[0] not in ('May', 'June')]\n", " \n", " # Step 3: Bernard now knows the birthday, so it cannot be a day that appears more than once.\n", " possible_dates = [date for date in possible_dates if 
len(day_map[int(date.split()[1])]) == 1]\n", " \n", " # Step 4: Albert now knows the birthday, so it must be in a unique month remaining after Bernard's elimination.\n", " remaining_months = set(date.split()[0] for date in possible_dates)\n", " for date in possible_dates:\n", " month = date.split()[0]\n", " if len(month_map[month]) == 1:\n", " return date\n", " \n", " return \"No solution found\"\n", "\n", "# Test the program with the given list of possible dates\n", "possible_dates = [\n", " \"May 15\", \"May 16\", \"May 19\",\n", " \"June 17\", \"June 18\",\n", " \"July 14\", \"July 16\",\n", " \"August 14\", \"August 15\", \"August 17\"\n", "]\n", "\n", "cheryls_birthday = filter_dates(possible_dates)\n", "print(\"Cheryl's birthday is:\", cheryls_birthday)" ] }, { "cell_type": "markdown", "id": "cb513413-f71d-476d-a6bc-211ab1589bbe", "metadata": {}, "source": [ "*[We see that the program fails to find a solution. It starts with the correct list of \"possible_dates\", and correctly updates the list twice to reflect Albert's and Bernard's first statements. So far so good. But then things go wrong: first it defines \"remaining_months\" and never uses it. Then it tries to find a possible date in a unique month, but it should be a unique month with respect to what is now known, not with respect to the original possible dates stored in the \"month_map\". 
The program has failed to track the two characters' different belief states over time.]* " ] }, { "cell_type": "markdown", "id": "93852dab-d167-4d14-8cc6-02979427eb01", "metadata": {}, "source": [ "## Microsoft Copilot\n", "\n", "*[Copilot makes a similar mistake.]*" ] }, { "cell_type": "code", "execution_count": 2, "id": "5d7e4408-3663-4f54-acdb-4bddf1ac9121", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cheryl's birthday is: None\n" ] } ], "source": [ "def cheryls_birthday(possible_dates):\n", " # Step 1: Filter out months with unique days\n", " unique_days = {date.split()[1] for date in possible_dates if sum(d.split()[1] == date.split()[1] \n", " for d in possible_dates) == 1}\n", " possible_dates = [date for date in possible_dates if date.split()[1] not in unique_days]\n", "\n", " # Step 2: Filter out dates with unique months\n", " unique_months = {date.split()[0] for date in possible_dates if sum(d.split()[0] == date.split()[0] \n", " for d in possible_dates) == 1}\n", " possible_dates = [date for date in possible_dates if date.split()[0] not in unique_months]\n", "\n", " # Step 3: Filter out remaining dates based on the logic of the puzzle\n", " for date in possible_dates:\n", " month, day = date.split()\n", " if all(d.split()[0] != month for d in possible_dates if d != date):\n", " return date\n", "\n", "# List of possible dates\n", "possible_dates = [\n", " \"May 15\", \"May 16\", \"May 19\",\n", " \"June 17\", \"June 18\",\n", " \"July 14\", \"July 16\",\n", " \"August 14\", \"August 15\", \"August 17\"\n", "]\n", "\n", "birthday = cheryls_birthday(possible_dates)\n", "print(f\"Cheryl's birthday is: {birthday}\")" ] }, { "cell_type": "markdown", "id": "d981531c-31f6-442e-b210-d0bff93d49bc", "metadata": {}, "source": [ "## Gemini Advanced\n", "\n", "*[Gemini makes a similar mistake.]*\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "7c0bd1e7-4ada-4b43-97a5-674f615b9f77", "metadata": {}, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Unable to determine Cheryl's birthday\n", "With the new set of dates, unable to determine Cheryl's birthday\n" ] } ], "source": [ "def cheryl_birthday(possible_dates):\n", " \"\"\"Solves the Cheryl's Birthday logic problem, allowing for a variable list of possible dates.\"\"\"\n", "\n", " # Possible months\n", " months = [\"May\", \"June\", \"July\", \"August\"]\n", "\n", " # Albert's statement eliminates some possibilities\n", " # He knows the month, so there must be only one possible date in that month\n", " for month in months:\n", " if len([date for date in possible_dates if (month, date) not in [(\"May\", 19), (\"June\", 18)]]) == 1:\n", " months = [month]\n", " break\n", "\n", " # Bernard's statement further narrows down the possibilities\n", " # He knows the date, so there must be only one possible month for that date\n", " for date in possible_dates:\n", " if len([month for month in months if (month, date) not in [(\"May\", 19), (\"June\", 18)]]) == 1:\n", " possible_dates = [date]\n", " break\n", "\n", " # After Bernard's statement, Albert also knows the answer\n", " # This means there's only one possible date left in the remaining month\n", " for month in months:\n", " if len([date for date in possible_dates if (month, date) not in [(\"May\", 19), (\"June\", 18)]]) == 1:\n", " possible_dates = [date]\n", " break\n", "\n", " # Now there should be only one possibility left\n", " if len(months) == 1 and len(possible_dates) == 1:\n", " return months[0], possible_dates[0]\n", " else:\n", " return None\n", "\n", "# Original problem setup\n", "original_dates = [14, 15, 16, 17, 18, 19]\n", "solution = cheryl_birthday(original_dates)\n", "\n", "if solution:\n", " print(f\"Cheryl's birthday is on {solution[0]} {solution[1]}\")\n", "else:\n", " print(\"Unable to determine Cheryl's birthday\")\n", "\n", "# Example with a different set of possible dates\n", "new_dates = [13, 14, 15, 16, 17]\n", "new_solution = 
cheryl_birthday(new_dates)\n", "\n", "if new_solution:\n", " print(f\"With the new set of dates, Cheryl's birthday is on {new_solution[0]} {new_solution[1]}\")\n", "else:\n", " print(\"With the new set of dates, unable to determine Cheryl's birthday\")" ] }, { "cell_type": "markdown", "id": "11f9bfc5-8548-4f7a-ac69-acecddb6f19b", "metadata": {}, "source": [ "*[The program accepts a list of \"possible_dates\", but confusingly, those are actually just days of the month; you aren't allowed to pass in different month/day combinations. And even if you do pass in different days, the program hard-codes certain days and months.]*" ] }, { "cell_type": "markdown", "id": "3932858f-170b-4979-a565-c3b41a74e7fe", "metadata": {}, "source": [ "## Meta Llama 405B\n", "\n", "*[Llama also fails to find a solution (and thus has no output). This program is quite flawed. I do like that it defines functions for the various pieces, but note that the function \"bernard_deduce\" is defined but never called.]*" ] }, { "cell_type": "code", "execution_count": 4, "id": "1faff935-74aa-44ee-9f37-15f71521e7d2", "metadata": {}, "outputs": [], "source": [ "# Possible dates for Cheryl's birthday\n", "dates = [\n", " (5, 13), (5, 15), (5, 16), \n", " (6, 13), (6, 15), (6, 16), \n", " (7, 13), (7, 15), (7, 16), \n", " (8, 13), (8, 15), (8, 16)\n", "]\n", "\n", "# Function to check if a date is possible based on Albert's knowledge\n", "def is_possible_albert(month):\n", " return all(m == month for m, _ in dates)\n", "\n", "# Function to check if a date is possible based on Bernard's knowledge\n", "def is_possible_bernard(day):\n", " return any((m, d) == (month, day) for m, d in dates)\n", "\n", "# Function to check if Albert can deduce the birthday\n", "def albert_deduce(month):\n", " return len([d for m, d in dates if m == month]) == 1\n", "\n", "# Function to check if Bernard can deduce the birthday\n", "def bernard_deduce(day):\n", " return len([m for m, d in dates if d == day]) == 1\n", "\n", "# 
Main program\n", "for month, day in dates:\n", " if not is_possible_albert(month):\n", " continue\n", " if not is_possible_bernard(day):\n", " continue\n", " if albert_deduce(month):\n", " print(f\"Cheryl's birthday is {month}/{day}\")\n", " break" ] }, { "cell_type": "markdown", "id": "02b1f7fd-918c-432d-8784-95dff3d4f86e", "metadata": {}, "source": [ "## Claude 3.5 Sonnet\n", "\n", "*[Claude makes a similar mistake. It deserves credit for explicitly trying an example with different dates, and it correctly reports that the second example has no solution, but that was by accident.]*" ] }, { "cell_type": "code", "execution_count": 5, "id": "1da1319a-95d1-4c98-8b36-d96a8c552cb7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cheryl's birthday is on: No unique solution found.\n", "With the new set of dates, Cheryl's birthday is on: No unique solution found.\n" ] } ], "source": [ "def solve_cheryls_birthday(dates):\n", " # Extract unique months and days\n", " months = sorted(set(date.split()[0] for date in dates))\n", " days = sorted(set(int(date.split()[1]) for date in dates))\n", "\n", " # Step 1: Albert knows the month, Bernard knows the day\n", " possible_months = [m for m in months if len([d for d in dates if d.startswith(m)]) > 1]\n", " possible_days = [d for d in days if len([date for date in dates if date.endswith(str(d))]) > 1]\n", "\n", " # Step 2: Albert knows Bernard doesn't know\n", " possible_dates = [date for date in dates if int(date.split()[1]) in possible_days]\n", "\n", " # Step 3: Bernard now knows\n", " possible_dates = [date for date in possible_dates if possible_dates.count(date.split()[1]) == 1]\n", "\n", " # Step 4: Albert now knows\n", " result = [date for date in possible_dates if possible_dates.count(date.split()[0]) == 1]\n", "\n", " return result[0] if result else \"No unique solution found.\"\n", "\n", "# Example usage\n", "dates = [\n", " \"May 15\", \"May 16\", \"May 19\",\n", " \"June 17\", \"June 
18\",\n", " \"July 14\", \"July 16\",\n", " \"August 14\", \"August 15\", \"August 17\"\n", "]\n", "\n", "solution = solve_cheryls_birthday(dates)\n", "print(f\"Cheryl's birthday is on: {solution}\")\n", "\n", "# Test with a different set of dates\n", "new_dates = [\n", " \"April 15\", \"April 16\", \"April 25\",\n", " \"May 15\", \"May 16\",\n", " \"June 17\", \"June 18\",\n", " \"July 14\", \"July 16\", \"July 17\"\n", "]\n", "\n", "new_solution = solve_cheryls_birthday(new_dates)\n", "print(f\"With the new set of dates, Cheryl's birthday is on: {new_solution}\")" ] }, { "cell_type": "markdown", "id": "52003af0-0ea1-4ed4-9b5e-5b9b8862a068", "metadata": {}, "source": [ "## Perplexity\n", "\n", "*[Once again, this program has a similar mistake.]*" ] }, { "cell_type": "code", "execution_count": 6, "id": "55e99206-42f8-44ea-a091-1ca6a14f4e89", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cheryl's birthday is: None\n" ] } ], "source": [ "from collections import defaultdict\n", "\n", "def solve_cheryls_birthday(possible_dates):\n", " months = defaultdict(list)\n", " days = defaultdict(list)\n", " \n", " for date in possible_dates:\n", " month, day = date.split()\n", " months[month].append(date)\n", " days[day].append(date)\n", " \n", " # Albert knows the month, Bernard knows the day\n", " possible = [date for date in possible_dates if len(months[date.split()[0]]) > 1 and len(days[date.split()[1]]) > 1]\n", " \n", " # Bernard now knows the date\n", " possible = [date for date in possible if len([d for d in possible if d.split()[1] == date.split()[1]]) == 1]\n", " \n", " # Albert now knows the date\n", " possible = [date for date in possible if len([d for d in possible if d.split()[0] == date.split()[0]]) == 1]\n", " \n", " return possible[0] if len(possible) == 1 else None\n", "\n", "# Example usage\n", "possible_dates = [\n", " \"May 15\", \"May 16\", \"May 19\",\n", " \"June 17\", \"June 18\",\n", " \"July 14\", \"July 16\",\n", 
" \"August 14\", \"August 15\", \"August 17\"\n", "]\n", "\n", "result = solve_cheryls_birthday(possible_dates)\n", "print(f\"Cheryl's birthday is: {result}\")" ] }, { "cell_type": "markdown", "id": "4b4e47b5-f5a2-491d-ad34-f8c08aceca59", "metadata": {}, "source": [ "## Cohere Chat Playground\n", "\n", "*[Cohere too has the same mistake.]*" ] }, { "cell_type": "code", "execution_count": 7, "id": "f1c6bfd7-6738-4325-8bc4-f91b4ee15e48", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cheryl's birthday is: The solution could not be determined.\n" ] } ], "source": [ "def solve_cheryls_birthday(possible_dates):\n", " # Create a dictionary to map months to their respective days\n", " month_days = {\n", " \"May\": [15, 16, 19],\n", " \"June\": [17, 18],\n", " \"July\": [14, 16],\n", " \"August\": [14, 15, 17],\n", " }\n", "\n", " # Albert knows the month, so we can filter out the possible dates based on that\n", " albert_knows = [date for date in possible_dates if date[1] in month_days]\n", "\n", " # Bernard knows the day, so we can further filter based on that\n", " bernard_knows = [date for date in albert_knows if date[2] in month_days[date[1]]]\n", "\n", " # If Bernard knows the birthday, Albert should also know it\n", " if len(bernard_knows) == 1:\n", " return bernard_knows[0]\n", " else:\n", " return \"The solution could not be determined.\"\n", "\n", "# Example usage:\n", "possible_dates = [\n", " (\"May\", 15, \"Friday\"),\n", " (\"May\", 16, \"Saturday\"),\n", " (\"May\", 19, \"Tuesday\"),\n", " (\"June\", 17, \"Thursday\"),\n", " (\"June\", 18, \"Friday\"),\n", " (\"July\", 14, \"Monday\"),\n", " (\"July\", 16, \"Wednesday\"),\n", " (\"August\", 14, \"Wednesday\"),\n", " (\"August\", 15, \"Thursday\"),\n", " (\"August\", 17, \"Saturday\"),\n", "]\n", "\n", "solution = solve_cheryls_birthday(possible_dates)\n", "print(f\"Cheryl's birthday is: {solution}\")" ] }, { "cell_type": "markdown", "id": 
"9d10112a-299e-49be-96ce-cf9551564894", "metadata": {}, "source": [ "*[This program seems promising in that it has separate variables for \"albert_knows\" and \"bernard_knows\", but it does not handle them correctly. The program introduces days of the week, which is extraneous, but not harmful. A bigger issue is that it accepts \"possible_dates\" as a parameter, but then ignores them and defines \"month_days\" based on the original dates, not on the input.]*" ] }, { "cell_type": "markdown", "id": "8379b2f8-9591-4f7e-be42-e5917ce0263c", "metadata": {}, "source": [ "## HuggingFace Chat\n", "\n", "*[HuggingFace is unique in suggesting a class definition, but still makes a similar mistake. Also, while it accepts a list of possible dates in the initializer, it then hard-codes certain months and days from the original problem.]*" ] }, { "cell_type": "code", "execution_count": 8, "id": "e0d33270-0f02-4a70-8258-4b87aab63b01", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "class CherylBirthdayProblem:\n", " def __init__(self, possible_dates):\n", " self.possible_dates = possible_dates \n", " \n", " def filter_dates(self, month, day):\n", " return [date for date in self.possible_dates if date[0] == month and date[1] == day]\n", " \n", " def filter_months(self, month):\n", " return [date for date in self.possible_dates if date[0] == month]\n", " \n", " def filter_days(self, day):\n", " return [date for date in self.possible_dates if date[1] == day]\n", " \n", " def solve(self):\n", " # Albert's first statement\n", " self.possible_dates = self.filter_months(5) + self.filter_months(6)\n", " \n", " # Bernard's first statement\n", " self.possible_dates = [date for date in self.possible_dates if date[1]!= 19]\n", " \n", " # Albert's second statement\n", " self.possible_dates = self.filter_days(14) + self.filter_days(15) + self.filter_days(16) + self.filter_days(17) + self.filter_days(18)\n", " \n", " # Bernard's 
second statement\n", " self.possible_dates = [date for date in self.possible_dates if date[1]!= 15 and date[0]!= 6]\n", " \n", " # Albert's third statement\n", " self.possible_dates = self.filter_months(7)\n", " \n", " # Bernard's third statement\n", " self.possible_dates = self.filter_days(16)\n", " \n", " return self.possible_dates\n", "\n", "possible_dates = [(5, 14), (5, 15), (5, 16), (6, 15), (6, 16), (6, 17), (6, 18), (6, 19), \n", " (7, 14), (7, 15), (7, 16), (7, 17), (7, 18), (8, 14), (8, 15), (8, 16), \n", " (8, 17), (8, 18), (8, 19)]\n", "problem = CherylBirthdayProblem(possible_dates)\n", "print(problem.solve())" ] }, { "cell_type": "markdown", "id": "39904aa9-d97f-4b0c-a2d0-4731c2d4aff0", "metadata": {}, "source": [ "## You.com Genius Mode\n", "\n", "*[You.com makes a similar mistake, although the end result is that it selects the wrong date rather than no date.]*" ] }, { "cell_type": "code", "execution_count": 9, "id": "232deb66-b208-4b65-9703-4519dc88c528", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('May', 19)\n" ] } ], "source": [ "def cheryls_birthday(possible_dates):\n", " # Step 1: Albert knows Bernard doesn't know. So, the month can't be one with a unique day.\n", " step1 = [date for date in possible_dates \n", " if not any(possible_dates.count(x) == 1 \n", " for x in [d for m, d in possible_dates if m == date[0]])]\n", "\n", " # Step 2: Bernard now knows. So, the day must be unique among the remaining possible dates.\n", " step2 = [date for date in step1 if sum(date[1] == d for m, d in step1) == 1]\n", "\n", " # Step 3: Albert now knows. 
So, the month must be unique among the remaining possible dates.\n", " step3 = [date for date in step2 if sum(date[0] == m for m, d in step2) == 1]\n", "\n", " return step3[0] if step3 else None\n", "\n", "# List of possible dates\n", "possible_dates = [('May', 15), ('May', 16), ('May', 19), ('June', 17), ('June', 18), \n", " ('July', 14), ('July', 16), ('August', 14), ('August', 15), ('August', 17)]\n", "\n", "print(cheryls_birthday(possible_dates))" ] }, { "cell_type": "markdown", "id": "3187709b-e6ff-45f7-9e06-a09420fdd79e", "metadata": {}, "source": [ "# 2025: Programming\n", "\n", "In 2025, three of the LLMs solved the programming test. I'll show Gemini and Claude. I'll test them on the original dates (which each program defines) and on the following three sets of alternative dates, two that have a unique solution, and one with no solution." ] }, { "cell_type": "code", "execution_count": 10, "id": "5f2927a6-edf7-42a7-afdf-37a8f272de37", "metadata": {}, "outputs": [], "source": [ "alt_dates = [('January', 15), ('January', 4), ('July', 13), ('July', 24), ('July', 30), \n", " ('March', 13), ('March', 24), ('May', 11), ('May', 17), ('May', 30)] # Answer: July 30\n", "\n", "alt_dates2 = [('April', 17), ('April', 18), ('April', 28), ('July', 16), ('July', 17), \n", " ('July', 19), ('June', 16), ('June', 29), ('March', 18), ('March', 19)] # Answer: March 18\n", "\n", "bad_dates = [('January', 15), ('January', 4), ('July', 13), ('July', 24), ('July', 30), \n", " ('March', 13), ('March', 24), ('May', 11), ('May', 17), ('May', 13)] # Answer: None" ] }, { "cell_type": "markdown", "id": "ea5f5d9e-2cca-44f9-881d-3ae1b5097fa0", "metadata": {}, "source": [ "## Gemini 2.5 Pro\n", "\n", "Here's the program from **Gemini 2.5 Pro** in 2025. It passes all the tests. However, there is a serious flaw: some of the explanatory output that it produces consists of static strings hard-coded to the original set of dates. 
The program still gets the right solution when given an alternative set of dates as input, but the explanatory output is wrong. (Also, I would have provided a keyword argument to make the printing optional.) I have to say that overall I like the style of [my program](Cheryl.ipynb) better, but this program certainly does the job (other than the issue with the explanatory output). Gemini's \"Show Thinking\" description of the program (not shown here) was also cogent, helpful, and correct." ] }, { "cell_type": "code", "execution_count": 11, "id": "e753ed64-7ac6-48ab-b729-ba838caf0987", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Starting Cheryl's Birthday Puzzle Solver ---\n", "\n", "## Initial Possible Dates\n", "Possible dates:\n", "- August: 14, 15, 17\n", "- July: 14, 16\n", "- June: 17, 18\n", "- May: 15, 16, 19\n", "(10 possibilities)\n", "\n", "## After Albert knows Bernard doesn't know\n", "Possible dates:\n", "- August: 14, 15, 17\n", "- July: 14, 16\n", "(5 possibilities)\n", " *Logic: We removed months (like May and June) that contained 'unique' days (18 or 19).\n", " If Albert had May or June, he couldn't be certain Bernard didn't know.*\n", "\n", "## After Albert says 'I don't know'\n", "Possible dates:\n", "- August: 14, 15, 17\n", "- July: 14, 16\n", "(5 possibilities)\n", " *Logic: Albert's month must still have multiple options. (In this case, no dates were removed)*\n", "\n", "## After Bernard says 'I know now'\n", "Possible dates:\n", "- August: 15, 17\n", "- July: 16\n", "(3 possibilities)\n", " *Logic: We looked at the current list (July & August dates). We kept only those dates\n", " whose day appears exactly once in this list (16, 15, 17). 
We removed 14 because\n", " it appeared twice, meaning Bernard wouldn't know if his day was 14.*\n", "\n", "## After Albert says 'Then I also know'\n", "Possible dates:\n", "- July: 16\n", "(1 possibilities)\n", " *Logic: We looked at the current list (July 16, Aug 15, Aug 17). We kept only those dates\n", " whose month appears exactly once. August appears twice, so if Albert had August,\n", " he wouldn't know. Since he knows, he must have July.*\n", "\n", "========================================\n", "🎉 The solution is: July 16 🎉\n", "========================================\n" ] } ], "source": [ "from collections import Counter\n", "\n", "def print_dates(dates, step_message):\n", " \"\"\"Helper function to print the current set of possible dates.\"\"\"\n", " print(f\"\\n## {step_message}\")\n", " if not dates:\n", " print(\"No possible dates remain.\")\n", " return\n", " months = sorted(list(set(d[0] for d in dates)))\n", " print(\"Possible dates:\")\n", " for month in months:\n", " days = sorted([d[1] for d in dates if d[0] == month])\n", " print(f\"- {month}: {', '.join(map(str, days))}\")\n", " print(f\"({len(dates)} possibilities)\")\n", "\n", "def solve_cheryl_birthday(possible_dates):\n", " \"\"\"\n", " Solves Cheryl's Birthday puzzle based on the statements.\n", "\n", " Args:\n", " possible_dates: A list of tuples, where each tuple is (month, day).\n", " \"\"\"\n", "\n", " print(\"--- Starting Cheryl's Birthday Puzzle Solver ---\")\n", " print_dates(possible_dates, \"Initial Possible Dates\")\n", "\n", " # --- Statement 1: Albert: \"I don't know when Cheryl's birthday is,\n", " # and I know that Bernard does not know.\" ---\n", "\n", " # Part 1.1: Bernard does not know.\n", " # If a day is unique in the initial list, Bernard *would* know if he\n", " # was told that day. Albert *knows* Bernard doesn't know. 
This means\n", " # Albert's month *cannot* contain any day that is initially unique.\n", " # So, we first find the days that appear only once (unique days).\n", " initial_day_counts = Counter(d[1] for d in possible_dates)\n", " unique_days = {day for day, count in initial_day_counts.items() if count == 1}\n", "\n", " # Now, we filter out any date whose day is unique. If a month contains\n", " # *any* unique day, Albert couldn't be sure Bernard didn't know.\n", " # So, we must eliminate any month that has a date with a unique day.\n", " months_with_unique_days = {d[0] for d in possible_dates if d[1] in unique_days}\n", " dates_after_albert1a = [d for d in possible_dates if d[0] not in months_with_unique_days]\n", "\n", " print_dates(dates_after_albert1a, \"After Albert knows Bernard doesn't know\")\n", " print(\" *Logic: We removed months (like May and June) that contained 'unique' days (18 or 19).\")\n", " print(\" If Albert had May or June, he couldn't be certain Bernard didn't know.*\")\n", "\n", " # Part 1.2: Albert does not know.\n", " # This means that *after* the previous filtering (or even before),\n", " # Albert's month must still have more than one possible date.\n", " # We check the months remaining in 'dates_after_albert1a'. If any of\n", " # these months now has only one date, it couldn't be Albert's month.\n", " # However, in this specific problem, this step doesn't eliminate more dates,\n", " # as both July and August still have multiple options. 
We'll keep the dates\n", " # from the previous step.\n", " dates_after_albert1 = dates_after_albert1a\n", " # Optional: Add a check here if needed for other datasets.\n", " # month_counts_a1 = Counter(d[0] for d in dates_after_albert1a)\n", " # dates_after_albert1 = [d for d in dates_after_albert1a if month_counts_a1[d[0]] > 1]\n", " # In this case, it yields the same result.\n", "\n", " print_dates(dates_after_albert1, \"After Albert says 'I don't know'\")\n", " print(\" *Logic: Albert's month must still have multiple options. (In this case, no dates were removed)*\")\n", "\n", "\n", " # --- Statement 2: Bernard: \"At first I didn't know..., but I know now.\" ---\n", " # Bernard heard Albert's statement, so he knows the list is now\n", " # 'dates_after_albert1'. Since he *now* knows, it means his day\n", " # must be unique within *this new list*.\n", " day_counts_b1 = Counter(d[1] for d in dates_after_albert1)\n", " dates_after_bernard1 = [d for d in dates_after_albert1 if day_counts_b1[d[1]] == 1]\n", "\n", " print_dates(dates_after_bernard1, \"After Bernard says 'I know now'\")\n", " print(\" *Logic: We looked at the current list (July & August dates). We kept only those dates\")\n", " print(\" whose day appears exactly once in this list (16, 15, 17). We removed 14 because\")\n", " print(\" it appeared twice, meaning Bernard wouldn't know if his day was 14.*\")\n", "\n", " # --- Statement 3: Albert: \"Then I also know when Cheryl's birthday is.\" ---\n", " # Albert heard Bernard's statement, so he knows the list is now\n", " # 'dates_after_bernard1'. Since Albert *now* knows, it means his\n", " # month must be unique within *this new list*.\n", " month_counts_a2 = Counter(d[0] for d in dates_after_bernard1)\n", " final_dates = [d for d in dates_after_bernard1 if month_counts_a2[d[0]] == 1]\n", "\n", " print_dates(final_dates, \"After Albert says 'Then I also know'\")\n", " print(\" *Logic: We looked at the current list (July 16, Aug 15, Aug 17). 
We kept only those dates\")\n", " print(\" whose month appears exactly once. August appears twice, so if Albert had August,\")\n", " print(\" he wouldn't know. Since he knows, he must have July.*\")\n", "\n", " # --- The Solution ---\n", " print(\"\\n\" + \"=\"*40)\n", " if len(final_dates) == 1:\n", " print(f\"🎉 The solution is: {final_dates[0][0]} {final_dates[0][1]} 🎉\")\n", " elif len(final_dates) > 1:\n", " print(\"🤔 The statements lead to multiple possibilities.\")\n", " else:\n", " print(\"😥 No solution found based on the statements.\")\n", " print(\"=\"*40)\n", "\n", " return final_dates\n", "\n", "# --- Run the puzzle with the original dates ---\n", "if __name__ == \"__main__\":\n", " cheryl_dates = [\n", " ('May', 15), ('May', 16), ('May', 19),\n", " ('June', 17), ('June', 18),\n", " ('July', 14), ('July', 16),\n", " ('August', 14), ('August', 15), ('August', 17)\n", " ]\n", "\n", " solve_cheryl_birthday(cheryl_dates)" ] }, { "cell_type": "markdown", "id": "e7739c8c-742b-4ccc-b22a-76981271ce31", "metadata": {}, "source": [ "Let's verify that it computes the correct solution with the alternative sets of dates:" ] }, { "cell_type": "code", "execution_count": 12, "id": "279abec2-552b-47f8-a53f-e34c2998f196", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Starting Cheryl's Birthday Puzzle Solver ---\n", "\n", "## Initial Possible Dates\n", "Possible dates:\n", "- January: 4, 15\n", "- July: 13, 24, 30\n", "- March: 13, 24\n", "- May: 11, 17, 30\n", "(10 possibilities)\n", "\n", "## After Albert knows Bernard doesn't know\n", "Possible dates:\n", "- July: 13, 24, 30\n", "- March: 13, 24\n", "(5 possibilities)\n", " *Logic: We removed months (like May and June) that contained 'unique' days (18 or 19).\n", " If Albert had May or June, he couldn't be certain Bernard didn't know.*\n", "\n", "## After Albert says 'I don't know'\n", "Possible dates:\n", "- July: 13, 24, 30\n", "- March: 13, 24\n", "(5 possibilities)\n", " 
*Logic: Albert's month must still have multiple options. (In this case, no dates were removed)*\n", "\n", "## After Bernard says 'I know now'\n", "Possible dates:\n", "- July: 30\n", "(1 possibilities)\n", " *Logic: We looked at the current list (July & August dates). We kept only those dates\n", " whose day appears exactly once in this list (16, 15, 17). We removed 14 because\n", " it appeared twice, meaning Bernard wouldn't know if his day was 14.*\n", "\n", "## After Albert says 'Then I also know'\n", "Possible dates:\n", "- July: 30\n", "(1 possibilities)\n", " *Logic: We looked at the current list (July 16, Aug 15, Aug 17). We kept only those dates\n", " whose month appears exactly once. August appears twice, so if Albert had August,\n", " he wouldn't know. Since he knows, he must have July.*\n", "\n", "========================================\n", "🎉 The solution is: July 30 🎉\n", "========================================\n" ] } ], "source": [ "assert solve_cheryl_birthday(alt_dates) == [('July', 30)]" ] }, { "cell_type": "code", "execution_count": 13, "id": "01ce560e-716b-4809-94fe-9a9704b9bf29", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Starting Cheryl's Birthday Puzzle Solver ---\n", "\n", "## Initial Possible Dates\n", "Possible dates:\n", "- April: 17, 18, 28\n", "- July: 16, 17, 19\n", "- June: 16, 29\n", "- March: 18, 19\n", "(10 possibilities)\n", "\n", "## After Albert knows Bernard doesn't know\n", "Possible dates:\n", "- July: 16, 17, 19\n", "- March: 18, 19\n", "(5 possibilities)\n", " *Logic: We removed months (like May and June) that contained 'unique' days (18 or 19).\n", " If Albert had May or June, he couldn't be certain Bernard didn't know.*\n", "\n", "## After Albert says 'I don't know'\n", "Possible dates:\n", "- July: 16, 17, 19\n", "- March: 18, 19\n", "(5 possibilities)\n", " *Logic: Albert's month must still have multiple options. 
(In this case, no dates were removed)*\n", "\n", "## After Bernard says 'I know now'\n", "Possible dates:\n", "- July: 16, 17\n", "- March: 18\n", "(3 possibilities)\n", " *Logic: We looked at the current list (July & August dates). We kept only those dates\n", " whose day appears exactly once in this list (16, 15, 17). We removed 14 because\n", " it appeared twice, meaning Bernard wouldn't know if his day was 14.*\n", "\n", "## After Albert says 'Then I also know'\n", "Possible dates:\n", "- March: 18\n", "(1 possibilities)\n", " *Logic: We looked at the current list (July 16, Aug 15, Aug 17). We kept only those dates\n", " whose month appears exactly once. August appears twice, so if Albert had August,\n", " he wouldn't know. Since he knows, he must have July.*\n", "\n", "========================================\n", "🎉 The solution is: March 18 🎉\n", "========================================\n" ] } ], "source": [ "assert solve_cheryl_birthday(alt_dates2) == [('March', 18)]" ] }, { "cell_type": "code", "execution_count": 14, "id": "4a836fd6-c908-4667-857c-0da26cba394d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Starting Cheryl's Birthday Puzzle Solver ---\n", "\n", "## Initial Possible Dates\n", "Possible dates:\n", "- January: 4, 15\n", "- July: 13, 24, 30\n", "- March: 13, 24\n", "- May: 11, 13, 17\n", "(10 possibilities)\n", "\n", "## After Albert knows Bernard doesn't know\n", "Possible dates:\n", "- March: 13, 24\n", "(2 possibilities)\n", " *Logic: We removed months (like May and June) that contained 'unique' days (18 or 19).\n", " If Albert had May or June, he couldn't be certain Bernard didn't know.*\n", "\n", "## After Albert says 'I don't know'\n", "Possible dates:\n", "- March: 13, 24\n", "(2 possibilities)\n", " *Logic: Albert's month must still have multiple options. 
(In this case, no dates were removed)*\n", "\n", "## After Bernard says 'I know now'\n", "Possible dates:\n", "- March: 13, 24\n", "(2 possibilities)\n", " *Logic: We looked at the current list (July & August dates). We kept only those dates\n", " whose day appears exactly once in this list (16, 15, 17). We removed 14 because\n", " it appeared twice, meaning Bernard wouldn't know if his day was 14.*\n", "\n", "## After Albert says 'Then I also know'\n", "No possible dates remain.\n", " *Logic: We looked at the current list (July 16, Aug 15, Aug 17). We kept only those dates\n", " whose month appears exactly once. August appears twice, so if Albert had August,\n", " he wouldn't know. Since he knows, he must have July.*\n", "\n", "========================================\n", "😥 No solution found based on the statements.\n", "========================================\n" ] } ], "source": [ "assert solve_cheryl_birthday(bad_dates) == []" ] }, { "cell_type": "markdown", "id": "c1962ed7-21ba-4a0d-a251-ef1ba91f2584", "metadata": {}, "source": [ "## Claude Sonnet 4\n", "\n", "Now for the program from **Anthropic Claude Sonnet 4** in 2025. It is similar to the Gemini program. It too produces explanatory output (that can't be turned off). With Claude, the explanatory output is not hard-coded to the original dates, so overall it is a better program." 
] }, { "cell_type": "code", "execution_count": 15, "id": "6f53ddf4-7e3d-41d8-9bc8-fce73bb92d45", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=== SOLVING ORIGINAL CHERYL'S BIRTHDAY PROBLEM ===\n", "Possible dates: [('May', 15), ('May', 16), ('May', 19), ('June', 17), ('June', 18), ('July', 14), ('July', 16), ('August', 14), ('August', 15), ('August', 17)]\n", "\n", "Unique days (appear only once): [18, 19]\n", "Months containing unique days: ['June', 'May']\n", "Valid months for Albert (don't contain unique days): ['August', 'July']\n", "Remaining dates after Albert's statement: [('July', 14), ('July', 16), ('August', 14), ('August', 15), ('August', 17)]\n", "\n", "Days that appear only once in remaining dates: [15, 16, 17]\n", "Remaining dates after Bernard's statement: [('July', 16), ('August', 15), ('August', 17)]\n", "\n", "Months that appear only once in remaining dates: ['July']\n", "Final answer: [('July', 16)]\n", "\n", "🎉 Cheryl's birthday is: July 16\n" ] } ], "source": [ "def solve_cheryls_birthday(possible_dates):\n", " \"\"\"\n", " Solve Cheryl's Birthday problem for any list of possible dates.\n", " \n", " Args:\n", " possible_dates: List of tuples (month, day) representing possible birthdays\n", " \n", " Returns:\n", " The unique solution as a tuple (month, day), or None if no unique solution\n", " \"\"\"\n", " \n", " print(f\"Possible dates: {possible_dates}\")\n", " print()\n", " \n", " # Step 1: Albert says \"I don't know when Cheryl's birthday is, \n", " # but I know that Bernard doesn't know either.\"\n", " \n", " # For Albert to know that Bernard doesn't know, Albert's month must not \n", " # contain any unique days (days that appear only once across all dates)\n", " \n", " # Find all unique days (days that appear only once)\n", " day_counts = {}\n", " for month, day in possible_dates:\n", " day_counts[day] = day_counts.get(day, 0) + 1\n", " \n", " unique_days = {day for day, count in day_counts.items() 
if count == 1}\n", " print(f\"Unique days (appear only once): {sorted(unique_days)}\")\n", " \n", " # Find months that contain unique days\n", " months_with_unique_days = set()\n", " for month, day in possible_dates:\n", " if day in unique_days:\n", " months_with_unique_days.add(month)\n", " \n", " print(f\"Months containing unique days: {sorted(months_with_unique_days)}\")\n", " \n", " # Albert's month cannot be one that contains unique days\n", " # (otherwise he couldn't be sure Bernard doesn't know)\n", " valid_months = set()\n", " for month, day in possible_dates:\n", " if month not in months_with_unique_days:\n", " valid_months.add(month)\n", " \n", " print(f\"Valid months for Albert (don't contain unique days): {sorted(valid_months)}\")\n", " \n", " # Filter possible dates to only those in valid months\n", " after_step1 = [(month, day) for month, day in possible_dates \n", " if month in valid_months]\n", " \n", " print(f\"Remaining dates after Albert's statement: {after_step1}\")\n", " print()\n", " \n", " # Step 2: Bernard says \"At first I didn't know when Cheryl's birthday is, \n", " # but I know now.\"\n", " \n", " # For Bernard to know now, his day must appear only once in the remaining dates\n", " remaining_day_counts = {}\n", " for month, day in after_step1:\n", " remaining_day_counts[day] = remaining_day_counts.get(day, 0) + 1\n", " \n", " unique_remaining_days = {day for day, count in remaining_day_counts.items() \n", " if count == 1}\n", " \n", " print(f\"Days that appear only once in remaining dates: {sorted(unique_remaining_days)}\")\n", " \n", " # Filter to dates where the day is unique in remaining set\n", " after_step2 = [(month, day) for month, day in after_step1 \n", " if day in unique_remaining_days]\n", " \n", " print(f\"Remaining dates after Bernard's statement: {after_step2}\")\n", " print()\n", " \n", " # Step 3: Albert says \"Then I also know when Cheryl's birthday is.\"\n", " \n", " # For Albert to know, there must be only one date 
left with his month\n", " month_counts = {}\n", " for month, day in after_step2:\n", " month_counts[month] = month_counts.get(month, 0) + 1\n", " \n", " # Find months that appear only once in remaining dates\n", " unique_remaining_months = {month for month, count in month_counts.items() if count == 1}\n", " \n", " print(f\"Months that appear only once in remaining dates: {sorted(unique_remaining_months)}\")\n", " \n", " # Final answer: dates where the month is unique in remaining set\n", " final_dates = [(month, day) for month, day in after_step2 \n", " if month in unique_remaining_months]\n", " \n", " print(f\"Final answer: {final_dates}\")\n", " \n", " if len(final_dates) == 1:\n", " return final_dates[0]\n", " else:\n", " return None\n", "\n", "# Original Cheryl's Birthday dates\n", "original_dates = [\n", " (\"May\", 15), (\"May\", 16), (\"May\", 19),\n", " (\"June\", 17), (\"June\", 18),\n", " (\"July\", 14), (\"July\", 16),\n", " (\"August\", 14), (\"August\", 15), (\"August\", 17)\n", "]\n", "\n", "print(\"=== SOLVING ORIGINAL CHERYL'S BIRTHDAY PROBLEM ===\")\n", "solution = solve_cheryls_birthday(original_dates)\n", "if solution:\n", " print(f\"\\n🎉 Cheryl's birthday is: {solution[0]} {solution[1]}\")\n", "else:\n", " print(\"\\n❌ No unique solution found\")" ] }, { "cell_type": "markdown", "id": "74b594df-5ca2-4cce-b880-6c7a42c7cb4e", "metadata": {}, "source": [ "Here we verify that it works on the alternative dates:" ] }, { "cell_type": "code", "execution_count": 16, "id": "943483b6-1e50-4d9a-a530-26d207541cfb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Possible dates: [('January', 15), ('January', 4), ('July', 13), ('July', 24), ('July', 30), ('March', 13), ('March', 24), ('May', 11), ('May', 17), ('May', 30)]\n", "\n", "Unique days (appear only once): [4, 11, 15, 17]\n", "Months containing unique days: ['January', 'May']\n", "Valid months for Albert (don't contain unique days): ['July', 'March']\n", "Remaining 
dates after Albert's statement: [('July', 13), ('July', 24), ('July', 30), ('March', 13), ('March', 24)]\n", "\n", "Days that appear only once in remaining dates: [30]\n", "Remaining dates after Bernard's statement: [('July', 30)]\n", "\n", "Months that appear only once in remaining dates: ['July']\n", "Final answer: [('July', 30)]\n" ] } ], "source": [ "assert solve_cheryls_birthday(alt_dates) == ('July', 30)" ] }, { "cell_type": "code", "execution_count": 17, "id": "6a714ed0-94a4-4fcf-a125-fbed4c1e7499", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Possible dates: [('April', 17), ('April', 18), ('April', 28), ('July', 16), ('July', 17), ('July', 19), ('June', 16), ('June', 29), ('March', 18), ('March', 19)]\n", "\n", "Unique days (appear only once): [28, 29]\n", "Months containing unique days: ['April', 'June']\n", "Valid months for Albert (don't contain unique days): ['July', 'March']\n", "Remaining dates after Albert's statement: [('July', 16), ('July', 17), ('July', 19), ('March', 18), ('March', 19)]\n", "\n", "Days that appear only once in remaining dates: [16, 17, 18]\n", "Remaining dates after Bernard's statement: [('July', 16), ('July', 17), ('March', 18)]\n", "\n", "Months that appear only once in remaining dates: ['March']\n", "Final answer: [('March', 18)]\n" ] } ], "source": [ "assert solve_cheryls_birthday(alt_dates2) == ('March', 18)" ] }, { "cell_type": "code", "execution_count": 18, "id": "e1d81eb7-053f-4c32-97b4-8a3642a2a5b9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Possible dates: [('January', 15), ('January', 4), ('July', 13), ('July', 24), ('July', 30), ('March', 13), ('March', 24), ('May', 11), ('May', 17), ('May', 13)]\n", "\n", "Unique days (appear only once): [4, 11, 15, 17, 30]\n", "Months containing unique days: ['January', 'July', 'May']\n", "Valid months for Albert (don't contain unique days): ['March']\n", "Remaining dates after Albert's statement: 
[('March', 13), ('March', 24)]\n", "\n", "Days that appear only once in remaining dates: [13, 24]\n", "Remaining dates after Bernard's statement: [('March', 13), ('March', 24)]\n", "\n", "Months that appear only once in remaining dates: []\n", "Final answer: []\n" ] } ], "source": [ "assert solve_cheryls_birthday(bad_dates) is None" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.15" } }, "nbformat": 4, "nbformat_minor": 5 }