Add files via upload
parent 905dca4158
commit 77b8f73025
				| @ -10,10 +10,10 @@ | ||||
|     "\n", | ||||
|     "The [538 Riddler for 11 June 2021](https://fivethirtyeight.com/features/can-you-split-the-states/) poses this problem  (paraphrased):\n", | ||||
|     "\n", | ||||
|     "*Given a map of the lower 48 states of the  USA, remove a subset of the states so that the map is cut into two  disjoint contiguous regions that are near-halves by area. Since Michigan’s upper and lower peninsulas are non-contiguous, you can treat them as two separate \"states,\" for a total of 49.*\n", | ||||
|     "*Given a map of the lower 48 states of the  USA, remove a subset of the states so that the map is cut into two  disjoint contiguous regions that are near-halves by area. Since Michigan’s upper and lower peninsulas are non-contiguous, you can treat them as two separate \"states,\" for a total of 49. \"Disjoint\" means the two regions are distinct and not adjacent (due to the cut).*\n", | ||||
|     "\n", | ||||
|     "Since \"near-halves\" is vague,  answer the questions:\n", | ||||
|     "1) What states should you remove to maximize the area of the smaller region? \n", | ||||
|     "Since \"near-halves\" is vague,  answer the two questions:\n", | ||||
|     "1) What states should you remove to maximize the area of the smaller of the two regions? \n", | ||||
|     "2) What states should you remove to minimize the difference of the areas of the two regions? \n", | ||||
|     "\n", | ||||
|     "# Vocabulary terms \n", | ||||
| @ -32,7 +32,9 @@ | ||||
|     "\n", | ||||
|     "\n", | ||||
|     "\n", | ||||
|     "Here is the code to implement the concepts, with the help of two sites that had data on state's [neighbors](https://theincidentaleconomist.com/wordpress/list-of-neighboring-states-with-stata-code/) and [area](https://www.census.gov/geographies/reference-files/2010/geo/state-area.html):" | ||||
|     "Credit goes to [mapchart.net](https://mapchart.net), which is the only map-coloring site I've found that allows you to split Michigan. \n", | ||||
|     "\n", | ||||
|     "Here is the code to implement the vocabulary concepts (with the help of two sites that had data on states' [neighbors](https://theincidentaleconomist.com/wordpress/list-of-neighboring-states-with-stata-code/) and [area](https://www.census.gov/geographies/reference-files/2010/geo/state-area.html)):" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -48,9 +50,9 @@ | ||||
|     "Region = States      # A contiguous set of states (must be hashable)\n", | ||||
|     "Split  = Tuple[Region, Region, Region] # (A, B, C) = (large region, small region, cut)\n", | ||||
|     "\n", | ||||
|     "def states(string)   -> States: \"Set of states\";  return States(string.split())\n", | ||||
|     "def statedict(**dic) -> dict:   \"{State:States}\"; return {s: states(dic[s]) for s in dic}\n", | ||||
|     "def area(states)     -> int:    \"Total area\";     return sum(areas[s] for s in states)\n", | ||||
|     "def states(string: str)  -> States: \"Set of states\";  return States(string.split())\n", | ||||
|     "def statedict(**dic)     -> dict:   \"{State:States}\"; return {s: states(dic[s]) for s in dic}\n", | ||||
|     "def area(states: States) -> int:    \"Total area\";     return sum(areas[s] for s in states)\n", | ||||
|     "\n", | ||||
|     "neighbors = statedict(\n", | ||||
|     "    AK='', AL='FL GA MS TN', AR='LA MO MS OK TN TX', AZ='CA CO NM NV UT', CA='AZ NV OR', \n", | ||||
| @ -97,7 +99,7 @@ | ||||
|     "# Strategy for answering the two questions\n", | ||||
|     "\n", | ||||
|     "My overall strategy:\n", | ||||
|     "- Exhaustively  (with some constraints) generate a large number of **cuts**. \n", | ||||
|     "- Generate a large number of **cuts**. \n", | ||||
|     "- For each cut *C*, determine the **split** into regions *A* and *B*. \n", | ||||
|     "- From the valid splits, find the ones that:\n", | ||||
|     "  1) **Maximize the area** of *B*\n", | ||||
| @ -108,8 +110,8 @@ | ||||
|     "Is it feasible to consider all possible cuts? A cut is a subset of the 49 states, so there are 2<sup>49</sup> or 500 trillion possible cuts, so **no**, we can't look at them all.  I have four ideas to reduce the number of cuts considered:\n", | ||||
|     "- **Limit the total area in a cut.** A large area in the cut means there won't be much area left to make *B* big for question 1. By default, I'll limit the area of the cut to be 1/5 the area of the country.\n", | ||||
|     "- **Limit the number of states in a cut.** Similarly, if there are too many states in a cut, there won't be many left for *A* or *B*. By default, the limit is 8.\n", | ||||
|     "- **Make cuts contiguous.** Noncontiguous cuts can't be optimal for question 1, so I won't consider them (although they can answer question 2).\n", | ||||
|     "- **Make cuts go border-to-border.** A cut can produce exactly two regions only if (a) the cut runs from one place on the border to another place on the border or (b) the cut forms a \"donut\" that surrounds some interior region. The US map isn't big enough to support a decent-sized donut (only KS and NE are not neighbors of a border state). Also, the US map seems to be too wide to have a good East-West cut, so I'll require a cut to go from the northern border state to the southern border.\n", | ||||
|     "- **Make cuts contiguous.** Noncontiguous cuts can't be optimal for question 1, so for now I won't consider them (although they can provide a good answer for question 2).\n", | ||||
|     "- **Make cuts go border-to-border.** A cut can produce exactly two regions only if (a) the cut runs from one place on the border to another place on the border or (b) the cut forms a \"donut\" that surrounds some interior region. The US map isn't big enough to support a decent-sized donut (only KS and NE are not neighbors of a border state). Also, the US map seems to be too wide to have a good East-West cut, so I'll start with cuts that go from the northern border to the southern border.\n", | ||||
|     "\n", | ||||
|     "The function `make_cuts` starts by building a set of partial cuts where each cut initially contains a single `start` state. Then in each iteration of the `while` loop, it yields any cut that has reached an `end` state, and creates a new set of cuts formed by adding a neighboring state to a current cut in all possible ways, as long as the area does not exceed `maxarea` and the size does not exceed `maxsize`. " | ||||
|    ] | ||||
| @ -147,12 +149,12 @@ | ||||
|     { | ||||
|      "data": { | ||||
|       "text/plain": [ | ||||
|        "{frozenset({'CA', 'ID', 'OR'}),\n", | ||||
|        " frozenset({'CA', 'OR', 'WA'}),\n", | ||||
|        "{frozenset({'AZ', 'ID', 'UT'}),\n", | ||||
|        " frozenset({'AZ', 'ID', 'NV'}),\n", | ||||
|        " frozenset({'ID', 'NM', 'UT'}),\n", | ||||
|        " frozenset({'CA', 'ID', 'NV'}),\n", | ||||
|        " frozenset({'AZ', 'ID', 'UT'})}" | ||||
|        " frozenset({'CA', 'OR', 'WA'}),\n", | ||||
|        " frozenset({'ID', 'NM', 'UT'}),\n", | ||||
|        " frozenset({'CA', 'ID', 'OR'})}" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 3, | ||||
| @ -168,7 +170,7 @@ | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "How many cuts are there in `usa49` with the default parameter values?" | ||||
|     "How many cuts are there in `usa49` (using the default parameter values)?" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -195,9 +197,11 @@ | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "That's a more manageable number than 500 trillion.\n", | ||||
|     "\n", | ||||
|     "# Making splits\n", | ||||
|     "\n", | ||||
|     "Now, given some candidate cuts, the function `make_splits` creates **splits**: tuples `(A, B, C)`, where `A` and `B` are the two regions defined by the cut `C`, with `A` larger than `B`. A split requires that the cut divides the country into exactly two regions, but not all cuts do that.  For example, the cut {WA, OR, CA} leaves only one region (consisting of all the other states). The cut {ID, OR, NV, AZ} leaves three regions: {WA}, {CA}, and {everything else}. To verify whether the cut makes two regions, `make_splits` first finds a maximal-size contiguous region `A` from the non-cut states, then finds a region `B` from the remaining states, then checks that `A` and `B` are non-empty and make up all the non-cut states.\n", | ||||
|     "Given some candidate cuts, the function `make_splits` creates **splits**: tuples `(A, B, C)`, where `A` and `B` are the two regions defined by the cut `C`, with `A` larger than `B`. A split requires that the cut divides the country into exactly two regions, but not all cuts do that.  For example, the cut {WA, OR, CA} leaves only one region (consisting of all the other states). The cut {ID, OR, NV, AZ} leaves three regions: {WA}, {CA}, and {everything else}. To verify whether the cut makes two regions, `make_splits` first finds a maximal-size contiguous region `A` from the non-cut states, then finds a region `B` from the remaining states, then checks that `A` and `B` are non-empty and make up all the non-cut states.\n", | ||||
|     "\n", | ||||
|     "We find the extent of a region with the `contiguous` function, which implements a [flood fill algorithm](https://en.wikipedia.org/wiki/Flood_fill): it maintains a mutable `region` and a `frontier` of the states that neighbor the region but are not in the region. We iterate adding the frontier into the region and computing a new frontier until there is no new frontier. " | ||||
|    ] | ||||
| @ -218,14 +222,14 @@ | ||||
|     "            B, A = sorted([A, B], key=area) # Ensure A larger than B in area\n", | ||||
|     "            yield (A, B, C)\n", | ||||
|     "            \n", | ||||
|     "def contiguous(legal: States) -> Region:\n", | ||||
|     "    \"\"\"Starting at one state, fill out to all legal contiguous states; return them.\"\"\"\n", | ||||
|     "def contiguous(states: States) -> Region:\n", | ||||
|     "    \"\"\"Starting at one of the states, expand out to all contiguous states; return them.\"\"\"\n", | ||||
|     "    region   = set() \n", | ||||
|     "    frontier = {min(legal)} if legal else None\n", | ||||
|     "    frontier = {min(states)} if states else None\n", | ||||
|     "    while frontier:\n", | ||||
|     "        region |= frontier\n", | ||||
|     "        frontier = {s1 for s in frontier for s1 in neighbors[s]\n", | ||||
|     "                    if s1 in legal and s1 not in region}\n", | ||||
|     "                    if s1 in states and s1 not in region}\n", | ||||
|     "    return Region(region)" | ||||
|    ] | ||||
|   }, | ||||
| @ -244,12 +248,12 @@ | ||||
|     { | ||||
|      "data": { | ||||
|       "text/plain": [ | ||||
|        "[frozenset({'CA', 'ID', 'OR'}),\n", | ||||
|        "[frozenset({'AZ', 'ID', 'UT'}),\n", | ||||
|        " frozenset({'AZ', 'ID', 'NV'}),\n", | ||||
|        " frozenset({'ID', 'NM', 'UT'}),\n", | ||||
|        " frozenset({'CA', 'ID', 'NV'}),\n", | ||||
|        " frozenset({'CA', 'OR', 'WA'}),\n", | ||||
|        " frozenset({'AZ', 'ID', 'UT'})]" | ||||
|        " frozenset({'CA', 'ID', 'OR'}),\n", | ||||
|        " frozenset({'ID', 'NM', 'UT'})]" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 6, | ||||
| @ -269,18 +273,18 @@ | ||||
|     { | ||||
|      "data": { | ||||
|       "text/plain": [ | ||||
|        "[(frozenset({'AZ', 'CO', 'MT', 'NM', 'NV', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'WA'}),\n", | ||||
|        "  frozenset({'CA', 'ID', 'OR'})),\n", | ||||
|        "[(frozenset({'CO', 'MT', 'NM', 'WY'}),\n", | ||||
|        "  frozenset({'CA', 'NV', 'OR', 'WA'}),\n", | ||||
|        "  frozenset({'AZ', 'ID', 'UT'})),\n", | ||||
|        " (frozenset({'CO', 'MT', 'NM', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'CA', 'OR', 'WA'}),\n", | ||||
|        "  frozenset({'AZ', 'ID', 'NV'})),\n", | ||||
|        " (frozenset({'AZ', 'CO', 'MT', 'NM', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'OR', 'WA'}),\n", | ||||
|        "  frozenset({'CA', 'ID', 'NV'})),\n", | ||||
|        " (frozenset({'CO', 'MT', 'NM', 'WY'}),\n", | ||||
|        "  frozenset({'CA', 'NV', 'OR', 'WA'}),\n", | ||||
|        "  frozenset({'AZ', 'ID', 'UT'}))]" | ||||
|        " (frozenset({'AZ', 'CO', 'MT', 'NM', 'NV', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'WA'}),\n", | ||||
|        "  frozenset({'CA', 'ID', 'OR'}))]" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 7, | ||||
| @ -298,7 +302,7 @@ | ||||
|    "source": [ | ||||
|     "# The answers\n", | ||||
|     "\n", | ||||
|     "The function `answers` puts it all together: makes cuts; makes splits from those cuts; finds the splits that answer the two questions; and prints the results: for *A*, *B*, and *C*, print the total area, the percentage of the country's area, the number of states in the region, and the set of states in the region. Then describe the delta between *A* and *B*'s area." | ||||
|     "The function `answers` puts it all together: makes cuts; makes splits from those cuts; finds the splits that answer the two questions; and prints the results. For *A*, *B*, and *C*, it prints the total area, the percentage of the country's area, the number of states in the region, and the set of states in the region. Then it prints the delta between *A* and *B*'s area." | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -316,7 +320,7 @@ | ||||
|     "    answer1 = max(splits, key=lambda s: area(s[1]))\n", | ||||
|     "    answer2 = min(splits, key=lambda s: area(s[0]) - area(s[1]))\n", | ||||
|     "    print(f'{len(country)} states ⇒ {len(cuts):,d} cuts',\n", | ||||
|     "          f'(size ≤ {maxsize}, area ≤ {maxarea:,d}) ⇒ {len(splits):,d} splits.')\n", | ||||
|     "          f'(cut size ≤ {maxsize}, cut area ≤ {maxarea:,d}) ⇒ {len(splits):,d} splits.')\n", | ||||
|     "    show('1. Split that maximizes area(B)',               country, answer1)\n", | ||||
|     "    show('2. Split that minimizes ∆ = area(A) - area(B)', country, answer2)\n", | ||||
|     "    \n", | ||||
| @ -344,7 +348,7 @@ | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 41,700 cuts (size ≤ 8, area ≤ 624,072) ⇒ 13,146 splits.\n", | ||||
|       "49 states ⇒ 41,700 cuts (cut size ≤ 8, cut area ≤ 624,072) ⇒ 13,146 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,345,558|43.122%|29|AL AR CT DE FL GA IN KS KY LA LP MA MD ME MS NC NH NJ NY OH OK PA RI SC TN TX VA VT WV\n", | ||||
| @ -369,7 +373,7 @@ | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "Here are maps of those two cuts, representing the best answers to questions 1 and 2 with my default paramtert values:\n", | ||||
|     "Here are maps of those two cuts, representing the best answers to questions 1 and 2 with my default parameter values:\n", | ||||
|     "\n", | ||||
|     "\n", | ||||
|     "\n" | ||||
| @ -381,7 +385,7 @@ | ||||
|    "source": [ | ||||
|     "# Improving on question 1\n", | ||||
|     "\n", | ||||
|     "The {CO,IL,MO,NE,NM} cut gave us two regions with 43% of the area each. But we limited cuts to 10 states, so this may not be the best possible result. If we looked at all possible cuts, the run time would be too long.  Fortunately, we can tighten the area constraint. We saw above that the cut {CO,IL,MO,NE,NM} produces a region *B* with area 1,344,149, so that means that any cut that is better for question 1 must create a split where the areas of both *A* and *B* are greater than 1,344,149, Therefore, we can reduce the `maxarea` for the cut from `area(usa49)/5` (which is  624,072) down to:" | ||||
|     "The {CO,IL,MO,NE,NM} cut gave us two regions with 43% of the area each. But we limited cuts to 8 states, so this may not be the best possible result. If we looked at all possible cuts, the run time would be too long.  Fortunately, we can tighten the area constraint. We saw above that the cut {CO,IL,MO,NE,NM} produces a region *B* with area 1,344,149, so any cut that is better for question 1 must create a split where the areas of both *A* and *B* are greater than 1,344,149. Therefore, we can reduce the `maxarea` for the cut from `area(usa49)/5` (which is 624,072) down to:" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -421,7 +425,7 @@ | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 547,779 cuts (size ≤ 49, area ≤ 432,062) ⇒ 42,685 splits.\n", | ||||
|       "49 states ⇒ 547,779 cuts (cut size ≤ 49, cut area ≤ 432,062) ⇒ 42,685 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,345,558|43.122%|29|AL AR CT DE FL GA IN KS KY LA LP MA MD ME MS NC NH NJ NY OH OK PA RI SC TN TX VA VT WV\n", | ||||
| @ -434,8 +438,8 @@ | ||||
|       "B|1,344,149|43.077%|15|AZ CA IA ID MN MT ND NV OR SD UP UT WA WI WY\n", | ||||
|       "C|  430,653|13.801%| 5|CO IL MO NE NM\n", | ||||
|       "∆|    1,409| 0.045%|\n", | ||||
|       "CPU times: user 35.5 s, sys: 325 ms, total: 35.8 s\n", | ||||
|       "Wall time: 36 s\n" | ||||
|       "CPU times: user 36.3 s, sys: 437 ms, total: 36.8 s\n", | ||||
|       "Wall time: 37.3 s\n" | ||||
|      ] | ||||
|     } | ||||
|    ], | ||||
| @ -451,7 +455,7 @@ | ||||
|     "\n", | ||||
|     "# Improving on question 2\n", | ||||
|     "\n", | ||||
|     "The code we have so far is designed to generate small cuts. We can generate cuts up to 10 states in a minute or so, and 11 states in a couple of minutes (if we take some care with the borders):" | ||||
|     "For question 1, good solutions necessarily have a small cut, but for question 2 we might need a big cut. I'll try going up to 10 states:" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -463,7 +467,7 @@ | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 1,209,042 cuts (size ≤ 10, area ≤ 1,040,120) ⇒ 278,472 splits.\n", | ||||
|       "49 states ⇒ 1,180,878 cuts (cut size ≤ 10, cut area ≤ 1,000,000) ⇒ 266,863 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,345,558|43.122%|29|AL AR CT DE FL GA IN KS KY LA LP MA MD ME MS NC NH NJ NY OH OK PA RI SC TN TX VA VT WV\n", | ||||
| @ -476,44 +480,13 @@ | ||||
|       "B|1,161,195|37.213%|10|AZ CA CO KS LA NM OR TX UT WA\n", | ||||
|       "C|  797,967|25.573%|10|AR ID KY MO MS MT NE NV OK WY\n", | ||||
|       "∆|        3| 0.000%|\n", | ||||
|       "CPU times: user 1min 3s, sys: 911 ms, total: 1min 4s\n", | ||||
|       "CPU times: user 1min 3s, sys: 1.05 s, total: 1min 4s\n", | ||||
|       "Wall time: 1min 4s\n" | ||||
|      ] | ||||
|     } | ||||
|    ], | ||||
|    "source": [ | ||||
|     "%time answers(usa49, maxsize=10, maxarea=area(usa49)//3)" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 13, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 3,049,254 cuts (size ≤ 11, area ≤ 1,040,120) ⇒ 430,761 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,345,558|43.122%|29|AL AR CT DE FL GA IN KS KY LA LP MA MD ME MS NC NH NJ NY OH OK PA RI SC TN TX VA VT WV\n", | ||||
|       "B|1,344,149|43.077%|15|AZ CA IA ID MN MT ND NV OR SD UP UT WA WI WY\n", | ||||
|       "C|  430,653|13.801%| 5|CO IL MO NE NM\n", | ||||
|       "∆|    1,409| 0.045%|\n", | ||||
|       "\n", | ||||
|       "2. Split that minimizes ∆ = area(A) - area(B):\n", | ||||
|       "A|1,161,198|37.214%|29|AL CT DE FL GA IA IL IN LP MA MD ME MN NC ND NH NJ NY OH PA RI SC SD TN UP VA VT WI WV\n", | ||||
|       "B|1,161,195|37.213%|10|AZ CA CO KS LA NM OR TX UT WA\n", | ||||
|       "C|  797,967|25.573%|10|AR ID KY MO MS MT NE NV OK WY\n", | ||||
|       "∆|        3| 0.000%|\n", | ||||
|       "CPU times: user 2min 39s, sys: 2.14 s, total: 2min 41s\n", | ||||
|       "Wall time: 2min 42s\n" | ||||
|      ] | ||||
|     } | ||||
|    ], | ||||
|    "source": [ | ||||
|     "%time answers(usa49, maxsize=11, maxarea=area(usa49)//3, start=states('MT ND WI IL IN'), end=south)" | ||||
|     "%time answers(usa49, maxsize=10, maxarea=1_000_000)" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -524,16 +497,63 @@ | ||||
|     "\n", | ||||
|     "\n", | ||||
|     "\n", | ||||
|     "However, I'm pretty sure we can find two regions with exactly equal areas. I want to allow a cut that might have 30 or 40 states, so instead of enumerating cuts, I'll focus on enumerating the regions *A* and *B*. Given two iterables of regions, `find_equal` will report all pairs (one from *As* and one from *Bs*) of regions that have the exact same area and that are not overlapping: *A* is disjoint from *B* and all its neighbors (the `neighborhood` of *B*)." | ||||
|     "Could we find two regions with exactly equal areas? It would be inefficient to enumerate all cuts for much more than 10 states. Instead I'll create two lists of regions, *As* and *Bs*. I can then determine if there are two regions of equal area in time proportional to the sum of the lists' lengths (not the product) by creating a table of {area: region} entries for all the *As* and checking if any of the *Bs* have an area that is in the table. (We also need to check that the two regions are disjoint. They are likely to be so because I start one in the east and one in the west, but we still need to check.) Here are two lists of regions:" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 13, | ||||
|    "metadata": {}, | ||||
|    "outputs": [], | ||||
|    "source": [ | ||||
|     "make_regions = make_cuts # The function `make_cuts` can be used to make regions\n", | ||||
|     "\n", | ||||
|     "As = list(make_regions(usa49, 5, start=west, end=usa49))\n", | ||||
|     "Bs = list(make_regions(usa49, 7, start=east, end=usa49))" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "How many regions are in each list?" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 14, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "data": { | ||||
|       "text/plain": [ | ||||
|        "'There are 376 regions in As and 22971 regions in Bs and 8,637,096 pairs of regions'" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 14, | ||||
|      "metadata": {}, | ||||
|      "output_type": "execute_result" | ||||
|     } | ||||
|    ], | ||||
|    "source": [ | ||||
|     "a, b = len(As), len(Bs)\n", | ||||
|     "f'There are {a} regions in As and {b} regions in Bs and {a * b:,d} pairs of regions'" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "If the average region area is about a million square miles, there's a pretty good chance that one of the 8 million pairs will consist of an *A* and a *B* with exactly equal areas. It's like buying 8 million lottery tickets with random  6-digit numbers. Let's check:" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 15, | ||||
|    "metadata": {}, | ||||
|    "outputs": [], | ||||
|    "source": [ | ||||
|     "def find_equal(As: Iterable[Region], Bs: Iterable[Region]) -> Iterator[Tuple[Region, Region, int]]:\n", | ||||
|     "def find_equal(As: List[Region], Bs: List[Region]) -> Iterator[Tuple[Region, Region, int]]:\n", | ||||
|     "    \"\"\"From As and Bs, find disjoint regions A and B that have the exact same area. Yield (A, B, area(B)).\"\"\"\n", | ||||
|     "    area_table = {area(A): A for A in As}\n", | ||||
|     "    for B in Bs:\n", | ||||
| @ -547,16 +567,9 @@ | ||||
|     "    return A | {s1 for s in A for s1 in neighbors[s]}" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "markdown", | ||||
|    "metadata": {}, | ||||
|    "source": [ | ||||
|     "I'll keep the regions small, and anchor *A* in the west and *B* in the east. If this experiment doesn't yield anything, I'll try larger regions:" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 15, | ||||
|    "execution_count": 16, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
| @ -565,33 +578,30 @@ | ||||
|        "[(frozenset({'CA', 'NV', 'UT'}),\n", | ||||
|        "  frozenset({'GA', 'IA', 'KY', 'MO', 'MS', 'TN', 'VA'}),\n", | ||||
|        "  359164),\n", | ||||
|        " (frozenset({'AZ', 'CA', 'OR'}),\n", | ||||
|        "  frozenset({'IA', 'IL', 'KY', 'NE', 'SD', 'VA', 'WV'}),\n", | ||||
|        "  376064),\n", | ||||
|        " (frozenset({'AZ', 'CA', 'OR'}),\n", | ||||
|        "  frozenset({'IL', 'IN', 'KS', 'MO', 'OH', 'TN', 'VA'}),\n", | ||||
|        "  376064),\n", | ||||
|        " (frozenset({'ID', 'OR', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'GA', 'IA', 'IL', 'IN', 'MO', 'TN', 'VA'}),\n", | ||||
|        "  364658),\n", | ||||
|        " (frozenset({'ID', 'NV', 'WA'}),\n", | ||||
|        "  frozenset({'KY', 'MS', 'NY', 'PA', 'TN', 'VT', 'WV'}),\n", | ||||
|        "  265439),\n", | ||||
|        " (frozenset({'ID', 'OR', 'UT', 'WY'}),\n", | ||||
|        "  frozenset({'GA', 'IA', 'IL', 'IN', 'MO', 'TN', 'VA'}),\n", | ||||
|        "  364658),\n", | ||||
|        " (frozenset({'AZ', 'CA', 'OR'}),\n", | ||||
|        "  frozenset({'IL', 'IN', 'KS', 'MO', 'OH', 'TN', 'VA'}),\n", | ||||
|        "  376064),\n", | ||||
|        " (frozenset({'ID', 'UT', 'WA', 'WY'}),\n", | ||||
|        "  frozenset({'AL', 'IL', 'IN', 'KY', 'TN', 'VA', 'WI'}),\n", | ||||
|        "  337577)]" | ||||
|        "  337577),\n", | ||||
|        " (frozenset({'AZ', 'CA', 'OR'}),\n", | ||||
|        "  frozenset({'IA', 'IL', 'KY', 'NE', 'SD', 'VA', 'WV'}),\n", | ||||
|        "  376064)]" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 15, | ||||
|      "execution_count": 16, | ||||
|      "metadata": {}, | ||||
|      "output_type": "execute_result" | ||||
|     } | ||||
|    ], | ||||
|    "source": [ | ||||
|     "make_regions = make_cuts # make_cuts actually just makes contiguous regions\n", | ||||
|     "\n", | ||||
|     "list(find_equal(make_regions(usa49, 4, start=west, end=usa49),\n", | ||||
|     "                make_regions(usa49, 7, start=east, end=usa49)))" | ||||
|     "list(find_equal(As, Bs))" | ||||
|    ] | ||||
|   }, | ||||
|   { | ||||
| @ -615,14 +625,14 @@ | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 16, | ||||
|    "execution_count": 17, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "50 states ⇒ 43,600 cuts (size ≤ 8, area ≤ 624,072) ⇒ 13,134 splits.\n", | ||||
|       "50 states ⇒ 43,600 cuts (cut size ≤ 8, cut area ≤ 624,072) ⇒ 13,134 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,345,626|43.123%|30|AL AR CT DC DE FL GA IN KS KY LA LP MA MD ME MS NC NH NJ NY OH OK PA RI SC TN TX VA VT WV\n", | ||||
| @ -660,14 +670,14 @@ | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 17, | ||||
|    "execution_count": 18, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "48 states ⇒ 41,735 cuts (size ≤ 8, area ≤ 624,072) ⇒ 18,642 splits.\n", | ||||
|       "48 states ⇒ 41,735 cuts (cut size ≤ 8, cut area ≤ 624,072) ⇒ 18,642 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,348,646|43.221%|30|AL CT DE FL GA IA IL IN KY MA MD ME MI MN MT NC ND NH NJ NY OH PA RI SC SD TN VA VT WI WV\n", | ||||
| @ -705,14 +715,14 @@ | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 18, | ||||
|    "execution_count": 19, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 1,237 cuts (size ≤ 4, area ≤ 624,072) ⇒ 384 splits.\n", | ||||
|       "49 states ⇒ 1,237 cuts (cut size ≤ 4, cut area ≤ 624,072) ⇒ 384 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|1,607,869|51.528%|18|AZ CA CO IA ID KS MN MT ND NE NV OR SD UP UT WA WI WY\n", | ||||
| @ -741,14 +751,14 @@ | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 19, | ||||
|    "execution_count": 20, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
|      "name": "stdout", | ||||
|      "output_type": "stream", | ||||
|      "text": [ | ||||
|       "49 states ⇒ 367 cuts (size ≤ 3, area ≤ 624,072) ⇒ 89 splits.\n", | ||||
|       "49 states ⇒ 367 cuts (cut size ≤ 3, cut area ≤ 624,072) ⇒ 89 splits.\n", | ||||
|       "\n", | ||||
|       "1. Split that maximizes area(B):\n", | ||||
|       "A|2,393,960|76.721%|42|AL AR CO CT DE FL GA IA IL IN KS KY LA LP MA MD ME MN MO MS MT NC ND NE NH NJ NM NY OH OK PA RI SC SD TN TX UP VA VT WI WV WY\n", | ||||
| @ -779,7 +789,7 @@ | ||||
|   }, | ||||
|   { | ||||
|    "cell_type": "code", | ||||
|    "execution_count": 20, | ||||
|    "execution_count": 21, | ||||
|    "metadata": {}, | ||||
|    "outputs": [ | ||||
|     { | ||||
| @ -788,7 +798,7 @@ | ||||
|        "'ok'" | ||||
|       ] | ||||
|      }, | ||||
|      "execution_count": 20, | ||||
|      "execution_count": 21, | ||||
|      "metadata": {}, | ||||
|      "output_type": "execute_result" | ||||
|     } | ||||
|  | ||||
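
A note for readers of this diff: the table lookup that the new question-2 text describes (build an {area: region} table from the *As*, then probe it with each *B*) can be sketched on toy data. This is only an illustration under stated assumptions, not the notebook's code: the five "states" `P` to `T`, their areas, their adjacency, and every `toy_` name are hypothetical, and only the overall shape follows the `find_equal` and neighborhood lines visible in the diff.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

Region = FrozenSet[str]

# Hypothetical toy map: five states in a chain P-Q-R-S-T (not real data).
toy_areas: Dict[str, int] = {'P': 3, 'Q': 4, 'R': 7, 'S': 5, 'T': 2}
toy_neighbors: Dict[str, Region] = {
    'P': frozenset('Q'), 'Q': frozenset('PR'), 'R': frozenset('QS'),
    'S': frozenset('RT'), 'T': frozenset('S')}

def toy_area(region: Region) -> int:
    """Total area of the states in a region."""
    return sum(toy_areas[s] for s in region)

def toy_neighborhood(region: Region) -> Region:
    """The region plus every state that borders it."""
    return region | {s1 for s in region for s1 in toy_neighbors[s]}

def toy_find_equal(As: List[Region], Bs: List[Region]) -> Iterator[Tuple[Region, Region, int]]:
    """Yield (A, B, area) where A and B have equal area and A does not
    touch B. Takes one pass over As and one pass over Bs."""
    area_table = {toy_area(A): A for A in As}   # keeps one A per distinct area
    for B in Bs:
        A = area_table.get(toy_area(B))
        if A is not None and A.isdisjoint(toy_neighborhood(B)):
            yield (A, B, toy_area(B))

As = [frozenset('P'), frozenset('PQ')]                  # areas 3 and 7
Bs = [frozenset('T'), frozenset('ST'), frozenset('R')]  # areas 2, 7 and 7
matches = list(toy_find_equal(As, Bs))
print(matches)
```

Here `{S, T}` matches `{P, Q}` (both have area 7 and they do not touch), while `{R}` also has area 7 but is rejected because it borders `Q`. One design consequence of keying the table on area is that when several *As* share an area only the last one is kept, so a sketch like this can miss valid pairs; a table of lists would avoid that at some cost in memory.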