<div align="right"><i>Peter Norvig<br>Sept 2018</i></div>

# Scheduling a Doubles Pickleball Tournament

My friend Steve asked for help in creating a schedule for a round-robin doubles pickleball tournament with 8 or 9 players on 2 courts. [Pickleball](https://en.wikipedia.org/wiki/Pickleball) is a tennis-like game played on a smaller court. In this type of tournament a player plays with a different partner in each game. To be precise:

> Given *P* players and *C* available courts, create a **schedule**: a list of **rounds** of play, where each round consists of up to *C* **games** played simultaneously. Each game pits one **pair** of players against another pair. The **criteria** for a schedule are:
>
> 1. A player cannot be scheduled to play twice in the same round.
> 2. Each player should partner with each other player once (or as close to that as possible).
> 3. Each player should play against each other player twice (or as close to that as possible).
> 4. Each court should be filled each round (or as close to that as possible); in other words, fewer rounds are better.


# Imports and Vocabulary

Let's start with some imports and some choices for basic types:

In [1]:
from collections import Counter
from itertools   import combinations
from typing      import List, Tuple, Set
import random
import math
random.seed(42)

Player   = int   # A player is an integer: 1
Pair     = tuple # A pair is a tuple of two players who are partners: (1, 2)
Game     = list  # A game is a list of two pairs of players: [(1, 2), (3, 4)]
Round    = list  # A round is a list of games that happen at once: [[(1, 2), (3, 4)], [(5, 6), (7, 8)]]
Schedule = list  # A schedule is a list of rounds that happen one after the other.

Here's a schedule for *P*=8 players on *C*=2 courts. It says that in the first round, players 1 and 2 partner against 3 and 4 on one court, while 5 and 6 partner against 7 and 8 on the other court.

    Round  1: | 1,2 vs 3,4 | 5,6 vs 7,8 |
    Round  2: | 1,3 vs 2,4 | 5,7 vs 6,8 |
    Round  3: | 1,4 vs 2,3 | 5,8 vs 6,7 |
    Round  4: | 1,5 vs 2,6 | 3,7 vs 4,8 |
    Round  5: | 1,6 vs 2,5 | 3,8 vs 4,7 |
    Round  6: | 1,7 vs 2,8 | 3,5 vs 4,6 |
    Round  7: | 1,8 vs 2,7 | 3,6 vs 4,5 |
    
This schedule is optimal according to criteria 1, 2, and 4, but it is not optimal in terms of number of times playing each opponent; for example players 1 and 2 play each other 6 times. We will see if we can do better. 
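
These claims can be checked mechanically. The sketch below is self-contained and uses its own helper, `check_schedule` (a name introduced here for illustration, not part of the code that follows), to confirm that no player is double-booked in a round, that every pair of partners appears exactly once, and that players 1 and 2 do face each other 6 times.

```python
from collections import Counter

# The 8-player, 2-court schedule shown above.
example = [
    [[(1, 2), (3, 4)], [(5, 6), (7, 8)]],
    [[(1, 3), (2, 4)], [(5, 7), (6, 8)]],
    [[(1, 4), (2, 3)], [(5, 8), (6, 7)]],
    [[(1, 5), (2, 6)], [(3, 7), (4, 8)]],
    [[(1, 6), (2, 5)], [(3, 8), (4, 7)]],
    [[(1, 7), (2, 8)], [(3, 5), (4, 6)]],
    [[(1, 8), (2, 7)], [(3, 6), (4, 5)]],
]

def check_schedule(schedule):
    """Return (no_double_booking, partner_counts, opponent_counts)."""
    no_double_booking = True
    partners, opponents = Counter(), Counter()
    for games in schedule:
        # Criterion 1: every player listed in this round must be distinct.
        seen = [p for game in games for pair in game for p in pair]
        if len(seen) != len(set(seen)):
            no_double_booking = False
        for side_a, side_b in games:
            partners[tuple(sorted(side_a))] += 1
            partners[tuple(sorted(side_b))] += 1
            for a in side_a:
                for b in side_b:
                    opponents[tuple(sorted((a, b)))] += 1
    return no_double_booking, partners, opponents

ok, partners, opponents = check_schedule(example)
print(ok)                                      # criterion 1 holds?
print(len(partners), set(partners.values()))   # 28 partnerships, each used once
print(opponents[(1, 2)])                       # times player 1 faces player 2
```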

# Tournament Scheduling Algorithm

Here is a strategy to create a schedule:

- Call `all_pairs(P)` to create a list of player pairs in which each pair appears exactly once (for criterion 2).
- Call `games_from_pairs` to take these pairs and put them together into a list of games (heeding criterion 1).
- Call `schedule_games` to take the list of games and put them into rounds with up to *C* games played at the same time (heeding criterion 1 again).
- This approach might not completely satisfy criteria 3 and 4, but we'll worry about that later.

Here's the code:

In [2]:
def tournament(P: int, C=2) -> Schedule:
    """Schedule games for a round-robin tournament for P players on C courts."""
    return schedule_games(games_from_pairs(all_pairs(P)), C)

# Generating Pairs of Players

Each player should partner with each other player once. `all_pairs` produces that list of partners:

In [3]:
def all_pairs(P: int) -> List[Pair]: 
    """All ways in which two out of P players can partner."""
    return list(combinations(range(1, P + 1), 2))

In [4]:
all_pairs(4)

[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

In [5]:
all_pairs(6)

[(1, 2),
 (1, 3),
 (1, 4),
 (1, 5),
 (1, 6),
 (2, 3),
 (2, 4),
 (2, 5),
 (2, 6),
 (3, 4),
 (3, 5),
 (3, 6),
 (4, 5),
 (4, 6),
 (5, 6)]

The astute reader may have noticed that `all_pairs(6)` has 15 pairs, an odd number that cannot be evenly combined into games. We must drop one of the pairs, meaning that two players will never partner with each other, and will end up playing one less game than everyone else. Since there are *P* &times; (*P*-1) / 2 pairs of *P* players, that means there are *P* &times; (*P*-1) / 4 games, which is a whole number when either *P* or *P*-1 is divisible by 4.
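
The arithmetic is easy to tabulate. This is a small sketch with a helper name of my own choosing, `pair_and_game_counts`; it reports the number of pairs, the number of whole games they make, and how many pairs are left over for each *P*.

```python
def pair_and_game_counts(P):
    """Return (pairs, games, leftover_pairs) for P players."""
    pairs = P * (P - 1) // 2            # number of distinct partnerships
    games, leftover = divmod(pairs, 2)  # each game uses two pairs
    return pairs, games, leftover

for P in range(4, 10):
    print(P, pair_and_game_counts(P))
```

For *P* = 6 and *P* = 7 one pair is left over; for *P* = 4, 5, 8, and 9 the pairs divide evenly into games.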

# Placing Pairs into Games


To place pairs into games, we'll choose one pair, `pair1`, to play against another pair `pair2`, making sure that between the two pairs there are four different players. Then we'll try to make `other_games` out of the remaining pairs. If we can't, we'll make a different choice for `pair2`. 

In [6]:
def games_from_pairs(pairs) -> List[Game]:
    "Combine pairs of players into a list of games."
    if len(pairs) < 2:
        return []
    pair1 = pairs[0]
    for pair2 in pairs:
        if len(set(pair1 + pair2)) == 4:
            game = [pair1, pair2]
            other_games = games_from_pairs([p for p in pairs if p != pair1 and p != pair2])
            if other_games is not None:
                return [game, *other_games]
    return None
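
The test `len(set(pair1 + pair2)) == 4` relies on tuple concatenation: adding two pairs gives a 4-tuple, and turning it into a set removes any repeated player. Here is that check in isolation, wrapped in a function named `four_distinct_players` purely for this illustration:

```python
def four_distinct_players(pair1, pair2):
    """True if the two pairs share no player, so they can form a game."""
    return len(set(pair1 + pair2)) == 4

print(four_distinct_players((1, 2), (3, 4)))  # no shared player
print(four_distinct_players((1, 2), (2, 3)))  # player 2 appears in both pairs
print(four_distinct_players((1, 2), (1, 2)))  # a pair cannot play itself
```

The last case is why the loop `for pair2 in pairs` does not need to skip `pair1` explicitly: the same test rejects it.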

In [7]:
games_from_pairs(all_pairs(6))

[[(1, 2), (3, 4)],
 [(1, 3), (2, 4)],
 [(1, 4), (2, 3)],
 [(1, 5), (2, 6)],
 [(1, 6), (2, 5)],
 [(3, 5), (4, 6)],
 [(3, 6), (4, 5)]]

In [8]:
games_from_pairs(all_pairs(8))

[[(1, 2), (3, 4)],
 [(1, 3), (2, 4)],
 [(1, 4), (2, 3)],
 [(1, 5), (2, 6)],
 [(1, 6), (2, 5)],
 [(1, 7), (2, 8)],
 [(1, 8), (2, 7)],
 [(3, 5), (4, 6)],
 [(3, 6), (4, 5)],
 [(3, 7), (4, 8)],
 [(3, 8), (4, 7)],
 [(5, 6), (7, 8)],
 [(5, 7), (6, 8)],
 [(5, 8), (6, 7)]]

That looks good. 

# Scheduling Games into Rounds on Courts

Now we need to take the games and schedule them such that no player plays twice in any round, and we take as few rounds as possible. We'll define `schedule_games(games, C)` to do this using a greedy approach: start with an empty schedule (with no rounds), and assign each game to the first round where it fits; if it doesn't fit in any existing round, add a new round. This does *not* guarantee the shortest possible schedule.

In [9]:
def schedule_games(games, C=2) -> Schedule:
    "Schedule games onto courts in rounds."
    schedule = Schedule() # Start with an empty schedule
    for game in games:
        round = first(round for round in schedule 
                      if len(round) < C and not (players(round) & players(game)))
        if not round: # Add new round
            round = Round()
            schedule.append(round)
        round.append(game)
    return schedule

def players(x) -> Set[Player]:
    "The set of players in a Pair, Game, Round, or Schedule."
    return set(x) if isinstance(x, Pair) else set().union(*map(players, x))

def first(iterable): return next(iter(iterable), None)
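
To see why a greedy first-fit approach is not guaranteed to give the shortest schedule, here is a self-contained sketch. The function `first_fit` is a simplified stand-in written for this example (it is not `schedule_games` itself), and the four games use ten made-up players. The same games take 3 rounds in one order and 2 rounds in another.

```python
def first_fit(games, C=2):
    """Greedy first-fit: put each game in the earliest round with room for it."""
    schedule = []
    for game in games:
        busy = set(game[0]) | set(game[1])
        for rnd in schedule:
            taken = {p for g in rnd for pair in g for p in pair}
            if len(rnd) < C and not (taken & busy):
                rnd.append(game)
                break
        else:  # no existing round could take the game
            schedule.append([game])
    return schedule

a = [(1, 2), (3, 4)]
b = [(5, 6), (7, 8)]
x = [(5, 9), (6, 10)]   # compatible with a, clashes with b and y
y = [(1, 9), (2, 10)]   # compatible with b, clashes with a and x

print(len(first_fit([a, b, x, y])))  # a and b fill round 1; x and y clash
print(len(first_fit([a, x, b, y])))  # a with x, then b with y
```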

Remember that  `tournament(P, C)` does `schedule_games(games_from_pairs(all_pairs(P)), C)`.

In [10]:
tournament(6, 1)

[[[(1, 2), (3, 4)]],
 [[(1, 3), (2, 4)]],
 [[(1, 4), (2, 3)]],
 [[(1, 5), (2, 6)]],
 [[(1, 6), (2, 5)]],
 [[(3, 5), (4, 6)]],
 [[(3, 6), (4, 5)]]]

In [11]:
tournament(8, 2)

[[[(1, 2), (3, 4)], [(5, 6), (7, 8)]],
 [[(1, 3), (2, 4)], [(5, 7), (6, 8)]],
 [[(1, 4), (2, 3)], [(5, 8), (6, 7)]],
 [[(1, 5), (2, 6)], [(3, 7), (4, 8)]],
 [[(1, 6), (2, 5)], [(3, 8), (4, 7)]],
 [[(1, 7), (2, 8)], [(3, 5), (4, 6)]],
 [[(1, 8), (2, 7)], [(3, 6), (4, 5)]]]

Are these good schedules? I'll need some analysis to tell.

# Visualizing a Schedule

I'll define a function, `report`, to make it easier to see what is going on:

In [12]:
def report(sched):
    "Print information about this schedule."
    for i, round in enumerate(sched, 1):
        print(f'Round {i:2}: | {str_from_round(round)} |')
    games = sum(sched, [])
    people = sorted(players(sched))
    P = len(people)
    opp = opponent_counts(games)
    fmt = ('{:2}|' + P * ' {}' + '   {:g}').format
    print('\nNumber of times each player plays against each opponent:\n')
    print('  |', *map(name, people), ' Games')
    print('--+' + '--' * P + '  -----')
    for row in people:
        counts = [opp[pairing(row, col)] for col in people]
        print(fmt(name(row), *[c or '-' for c in counts], sum(counts) / 2))
    print('\nSummary of counts in table:', 
          '; '.join(f'{t}: {c}' for t, c in Counter(opp.values()).most_common()))
        
def str_from_round(round) -> str:
    "A string representing a round of games."
    return ' | '.join(f'{name(a)},{name(b)} vs {name(c)},{name(d)}'
                      for [(a, b), (c, d)] in round)
        
def opponent_counts(games) -> Counter:
    "A Counter of {(player, opponent): times_played}."
    return Counter(pairing(p1, p2) for A, B in games for p1 in A for p2 in B)

def name(player) -> str:
    """A one-character string representing the player."""
    return '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'[player]

def pairing(p1, p2) -> tuple: return tuple(sorted((p1, p2)))

In [13]:
report(tournament(8, 2))

Round  1: | 1,2 vs 3,4 | 5,6 vs 7,8 |
Round  2: | 1,3 vs 2,4 | 5,7 vs 6,8 |
Round  3: | 1,4 vs 2,3 | 5,8 vs 6,7 |
Round  4: | 1,5 vs 2,6 | 3,7 vs 4,8 |
Round  5: | 1,6 vs 2,5 | 3,8 vs 4,7 |
Round  6: | 1,7 vs 2,8 | 3,5 vs 4,6 |
Round  7: | 1,8 vs 2,7 | 3,6 vs 4,5 |

Number of times each player plays against each opponent:

  | 1 2 3 4 5 6 7 8  Games
--+----------------  -----
1 | - 6 2 2 1 1 1 1   7
2 | 6 - 2 2 1 1 1 1   7
3 | 2 2 - 6 1 1 1 1   7
4 | 2 2 6 - 1 1 1 1   7
5 | 1 1 1 1 - 6 2 2   7
6 | 1 1 1 1 6 - 2 2   7
7 | 1 1 1 1 2 2 - 6   7
8 | 1 1 1 1 2 2 6 -   7

Summary of counts in table: 1: 16; 2: 8; 6: 4


That looks decent: we fill both courts on every round, and every player plays 7 games. But the opponents are not evenly distributed. For example, player 1 plays against player 2 in 6 of the 7 games, and only plays against players 5 through 8 in one game each.


# Improving a Schedule through Hillclimbing

How can I improve on criteria 3 and 4 (having each player play each opponent twice, and minimizing the number of rounds)? My strategy will be to start with a non-optimal list of games, and randomly pick a pair from one game and swap them with a pair from another game. Then see if a better schedule can be made with this altered list of games. If it can, keep the altered games; if not, revert the swap.  This is called a **hillclimbing** approach; the analogy is that we start out in a valley, take a step in a random direction, and if that is upward, keep going, otherwise step back and try again. Eventually you reach a peak (although it may not be the global peak).
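
The swap-and-revert step can be shown on its own. In this sketch, `swap_pairs` and `is_valid` are illustrative helpers written for this example; the point is that a swap may or may not leave both games with four distinct players, and that repeating the same swap undoes it.

```python
def swap_pairs(games, g1, s1, g2, s2):
    """Exchange the pair on side s1 of game g1 with the pair on side s2 of game g2."""
    games[g1][s1], games[g2][s2] = games[g2][s2], games[g1][s1]

def is_valid(game):
    """A game is valid if its two pairs contain four different players."""
    return len(set(game[0] + game[1])) == 4

games = [[(1, 2), (3, 4)], [(5, 6), (7, 8)], [(1, 3), (2, 4)]]

swap_pairs(games, 0, 1, 1, 0)          # exchange (3, 4) and (5, 6)
print(games, all(map(is_valid, games)))

swap_pairs(games, 0, 1, 2, 0)          # exchange (5, 6) and (1, 3)
print(games, all(map(is_valid, games)))  # game 0 now has player 1 twice

swap_pairs(games, 0, 1, 2, 0)          # the same swap again reverts it
print(games, all(map(is_valid, games)))
```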

I measure "better schedule" both in terms of minimal variation from the optimal distribution of opponents (as measured by `total_difference(games)`) and in terms of the number of rounds (as measured by `len(sched)`). I keep track of both a list of `games` and a complete schedule, `sched`. I make random changes to the `games`, and then re-schedule the games after each change. By default, I allot 200,000 tries of the hillclimbing process, but I can exit early if an optimal schedule is found.

In [14]:
def tournament2(P, C=2, tries=200_000):
    "Schedule games for P players on C courts by randomly swapping pairs between games, up to `tries` times."
    pairs = all_pairs(P)
    games = games_from_pairs(pairs)
    sched = schedule_games(games, C)
    diff  = total_difference(games)
    for _ in range(tries):
        if is_optimal(sched, diff, P, C):
            return sched
        # Randomly swap pairs from two games
        ((i, j), _) = idx = random.sample(range(len(games)), 2), (side(), side())
        swap(games, idx)
        diff2 = total_difference(games)
        # Keep the swap if better (or same); revert if worse
        if (diff2 <= diff and len(players(games[i])) == 4 == len(players(games[j]))
            and len(schedule_games(games, C)) <= len(sched)):
                sched, diff = schedule_games(games, C), diff2
        else:
            swap(games, idx) # Swap them back
    return sched

def side() -> int: "Random side of the net"; return random.choice((0, 1))

def swap(games, idx):
    "Swap the pair at games[g1][s1] with the pair at games[g2][s2]."
    (g1, g2), (s1, s2) = idx
    games[g1][s1], games[g2][s2] = games[g2][s2], games[g1][s1]

def total_difference(games, optimal=2):
    "The total difference from an optimal distribution of opponents."
    return sum(abs(count - optimal) ** 3 
               for count in opponent_counts(games).values())

def is_optimal(schedule, diff, P, C) -> bool:
    """Is this schedule with this diff count optimal for P players on C courts?"""
    return diff == 0 and len(schedule) == math.ceil(P * (P - 1) / 4 / C)
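
To get a feel for the cubed penalty in `total_difference`, it can be applied by hand to the summary line that `report(tournament(8, 2))` printed earlier (`1: 16; 2: 8; 6: 4`). The helper `penalty` below is a stand-in I wrote for this example: it works from a `{count: number_of_player_pairs}` summary rather than from a list of games, and its `power` parameter is an addition for comparison only.

```python
def penalty(summary, optimal=2, power=3):
    """Sum of |count - optimal|**power over a {count: how_many_pairs} summary."""
    return sum(n * abs(count - optimal) ** power
               for count, n in summary.items())

first_schedule = {1: 16, 2: 8, 6: 4}
print(penalty(first_schedule))            # 16*1 + 8*0 + 4*64
print(penalty(first_schedule, power=1))   # 16*1 + 8*0 + 4*4
print(penalty({2: 28}))                   # every pair of opponents meets twice
```

With the cube, the four pairs that meet 6 times account for 256 of the 272 total, so the search is pushed hardest to break up the most lopsided matchups. One thing to keep in mind when reading `total_difference`: `opponent_counts` only has entries for players who oppose each other at least once, so two players who never meet add nothing to the sum.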

I'll check that this works for a simple 4-player tournament:

In [15]:
report(tournament2(4, 1, 10))

Round  1: | 1,2 vs 3,4 |
Round  2: | 1,3 vs 2,4 |
Round  3: | 1,4 vs 2,3 |

Number of times each player plays against each opponent:

  | 1 2 3 4  Games
--+--------  -----
1 | - 2 2 2   3
2 | 2 - 2 2   3
3 | 2 2 - 2   3
4 | 2 2 2 -   3

Summary of counts in table: 2: 6
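
The table is small enough to verify by hand. This self-contained sketch recounts the opponents for the three games above, using the same idea as `opponent_counts` (every player on one side opposes every player on the other side):

```python
from collections import Counter

games = [[(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)]]

counts = Counter(tuple(sorted((a, b)))
                 for side_a, side_b in games
                 for a in side_a
                 for b in side_b)

for opponents, times in sorted(counts.items()):
    print(opponents, times)
print(Counter(counts.values()))   # how many pairs of opponents have each count
```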


# 8 Player Tournament

Let's try to answer Steve's request for an 8-player tournament:

In [16]:
report(tournament2(8, 2))

Round  1: | 6,7 vs 3,4 | 1,2 vs 5,8 |
Round  2: | 5,6 vs 7,8 | 2,4 vs 1,3 |
Round  3: | 1,4 vs 5,7 | 2,3 vs 6,8 |
Round  4: | 3,8 vs 1,5 | 4,7 vs 2,6 |
Round  5: | 3,7 vs 4,8 | 1,6 vs 2,5 |
Round  6: | 3,6 vs 1,7 | 4,5 vs 2,8 |
Round  7: | 3,5 vs 2,7 | 4,6 vs 1,8 |

Number of times each player plays against each opponent:

  | 1 2 3 4 5 6 7 8  Games
--+----------------  -----
1 | - 2 2 2 3 2 1 2   7
2 | 2 - 2 2 3 2 1 2   7
3 | 2 2 - 2 1 2 3 2   7
4 | 2 2 2 - 1 2 3 2   7
5 | 3 3 1 1 - 1 2 3   7
6 | 2 2 2 2 1 - 3 2   7
7 | 1 1 3 3 2 3 - 1   7
8 | 2 2 2 2 3 2 1 -   7

Summary of counts in table: 2: 16; 3: 6; 1: 6


That's good, but not perfect. In a previous run I was luckier and achieved a perfect schedule (where every player plays each opponent exactly twice):

In [17]:
perfect8: Schedule = [
 [[(1, 6), (2, 4)], [(3, 5), (7, 8)]],
 [[(1, 5), (3, 6)], [(2, 8), (4, 7)]],
 [[(2, 3), (6, 8)], [(4, 5), (1, 7)]],
 [[(4, 6), (3, 7)], [(1, 2), (5, 8)]],
 [[(1, 8), (6, 7)], [(3, 4), (2, 5)]],
 [[(2, 6), (5, 7)], [(1, 4), (3, 8)]],
 [[(2, 7), (1, 3)], [(4, 8), (5, 6)]], 
]

report(perfect8)

Round  1: | 1,6 vs 2,4 | 3,5 vs 7,8 |
Round  2: | 1,5 vs 3,6 | 2,8 vs 4,7 |
Round  3: | 2,3 vs 6,8 | 4,5 vs 1,7 |
Round  4: | 4,6 vs 3,7 | 1,2 vs 5,8 |
Round  5: | 1,8 vs 6,7 | 3,4 vs 2,5 |
Round  6: | 2,6 vs 5,7 | 1,4 vs 3,8 |
Round  7: | 2,7 vs 1,3 | 4,8 vs 5,6 |

Number of times each player plays against each opponent:

  | 1 2 3 4 5 6 7 8  Games
--+----------------  -----
1 | - 2 2 2 2 2 2 2   7
2 | 2 - 2 2 2 2 2 2   7
3 | 2 2 - 2 2 2 2 2   7
4 | 2 2 2 - 2 2 2 2   7
5 | 2 2 2 2 - 2 2 2   7
6 | 2 2 2 2 2 - 2 2   7
7 | 2 2 2 2 2 2 - 2   7
8 | 2 2 2 2 2 2 2 -   7

Summary of counts in table: 2: 28


# 9 Player Tournament

For 9 players, I can fit the 18 games into 9 rounds, but some players play each other 1 or 3 times. I'll report the results of a previous run:

In [18]:
previous9: Schedule = [
 [[(1, 7), (4, 9)], [(3, 5), (2, 6)]],
 [[(2, 7), (1, 3)], [(4, 8), (6, 9)]],
 [[(5, 9), (1, 6)], [(7, 8), (3, 4)]],
 [[(7, 9), (5, 8)], [(1, 2), (4, 6)]],
 [[(3, 8), (1, 5)], [(2, 9), (6, 7)]],
 [[(1, 4), (2, 5)], [(3, 6), (8, 9)]],
 [[(5, 6), (4, 7)], [(1, 8), (2, 3)]],
 [[(1, 9), (3, 7)], [(2, 8), (4, 5)]],
 [[(3, 9), (2, 4)], [(6, 8), (5, 7)]]
]

report(previous9)

Round  1: | 1,7 vs 4,9 | 3,5 vs 2,6 |
Round  2: | 2,7 vs 1,3 | 4,8 vs 6,9 |
Round  3: | 5,9 vs 1,6 | 7,8 vs 3,4 |
Round  4: | 7,9 vs 5,8 | 1,2 vs 4,6 |
Round  5: | 3,8 vs 1,5 | 2,9 vs 6,7 |
Round  6: | 1,4 vs 2,5 | 3,6 vs 8,9 |
Round  7: | 5,6 vs 4,7 | 1,8 vs 2,3 |
Round  8: | 1,9 vs 3,7 | 2,8 vs 4,5 |
Round  9: | 3,9 vs 2,4 | 6,8 vs 5,7 |

Number of times each player plays against each opponent:

  | 1 2 3 4 5 6 7 8 9  Games
--+------------------  -----
1 | - 3 3 2 2 1 2 1 2   8
2 | 3 - 3 3 2 2 1 1 1   8
3 | 3 3 - 1 1 1 2 3 2   8
4 | 2 3 1 - 2 2 2 2 2   8
5 | 2 2 1 2 - 3 2 3 1   8
6 | 1 2 1 2 3 - 2 2 3   8
7 | 2 1 2 2 2 2 - 2 3   8
8 | 1 1 3 2 3 2 2 - 2   8
9 | 2 1 2 2 1 3 3 2 -   8

Summary of counts in table: 2: 18; 3: 9; 1: 9


# 16 Player Tournament

Let's jump to 16 players on 4 courts:

In [19]:
%time report(tournament2(P=16, C=4))

Round  1: | 1,2 vs 3,4 | 5,6 vs 7,8 | D,E vs F,G | 9,A vs B,C |
Round  2: | 1,3 vs 2,4 | 5,7 vs A,G | C,E vs 6,8 | B,F vs 9,D |
Round  3: | 1,4 vs 2,3 | C,G vs 6,7 | 5,8 vs 9,F | A,E vs B,D |
Round  4: | A,D vs 2,6 | 1,5 vs 4,7 | 3,E vs 8,9 | B,G vs C,F |
Round  5: | 1,6 vs 2,5 | D,F vs 4,8 | 9,B vs 3,7 | A,C vs E,G |
Round  6: | 9,G vs 2,8 | 3,5 vs C,D | 4,6 vs A,F | B,E vs 1,7 |
Round  7: | 9,C vs 2,7 | D,G vs 4,5 | 1,8 vs A,B | 3,6 vs E,F |
Round  8: | 1,9 vs 2,A | 3,G vs 8,B | 5,D vs 6,E | 7,C vs 4,F |
Round  9: | 4,C vs 2,9 | 6,G vs 1,A | 7,F vs 5,E | 3,B vs 8,D |
Round 10: | 6,F vs 3,9 | 2,C vs 5,G | 7,D vs 1,B | 4,A vs 8,E |
Round 11: | 5,F vs 2,B | 3,A vs 1,C | 8,G vs 6,D | 4,9 vs 7,E |
Round 12: | 6,B vs 2,E | 5,C vs 1,D | 3,F vs 4,G | 7,9 vs 8,A |
Round 13: | 8,F vs 2,D | 3,C vs 6,A | 5,9 vs 4,B | 7,G vs 1,E |
Round 14: | 6,9 vs 1,F | 5,A vs 4,E | 2,G vs 3,D | 7,B vs 8,C |
Round 15: | 5,B vs 2,F | 1,G vs 9,E | 6,C vs 4,D | 7,A vs 3,8 |

Number of times each player plays again

That's a pretty good schedule! It takes the minimum 15 rounds, and although not all counts are 2, most are in the 1 to 3 range.
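
The minimum number of rounds comes from the same expression that `is_optimal` uses: the number of games, *P* &times; (*P*-1) / 4, divided by the number of courts and rounded up. A quick sketch (the function name `min_rounds` is mine) for the tournament sizes discussed here:

```python
import math

def min_rounds(P, C):
    """Fewest rounds that can hold P * (P - 1) / 4 games on C courts."""
    return math.ceil(P * (P - 1) / 4 / C)

for P, C in [(4, 1), (8, 2), (9, 2), (16, 4)]:
    print(f'{P:2} players on {C} courts: at least {min_rounds(P, C)} rounds')
```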

# Addendum: Counting Schedules

A reader asked "*couldn't you have tried all possible schedules?*" That's a great question! As [Ken Thompson says](https://users.ece.utexas.edu/~adnan/pike.html), "when in doubt, use brute force." How many possible schedules are there? 

- Assume a schedule with *R* rounds on *C* courts, with every court filled on every round.
- That means there are *G* = *CR* games and 2*G* slots in the schedule for pairs to fill.
- We can fill those slots with pairs in (2*G*)! ways.
- But that over-counts, because order doesn't matter in the following three ways:
  - The order of pairs within a game doesn't matter, so divide by 2*<sup>G</sup>*.
  - The order of games within a round doesn't matter, so divide by *C*!*<sup>R</sup>*.
  - The order of rounds within the schedule doesn't matter, so divide by *R*!.
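
Written as a single expression (with *N* as a label introduced here for the number of schedules), the steps above amount to:

$$N = \frac{(2G)!}{2^{G} \, (C!)^{R} \, R!}, \qquad G = CR$$

As a check on the smallest case, *P* = 4 players on *C* = 1 court gives *G* = 3 and *R* = 3, so

$$N = \frac{6!}{2^{3} \cdot (1!)^{3} \cdot 3!} = \frac{720}{48} = 15$$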

That gives us:

In [20]:
from math import factorial

def count_schedules(P):
    "Number of possible schedules for P players with all courts full."
    G = P * (P - 1) // 4 # Number of games
    C = P // 4           # Number of courts
    R = G // C           # Number of rounds
    return factorial(2 * G) / (2 ** G * factorial(C) ** R * factorial(R))

{P: count_schedules(P) 
 for P in range(4, 18) if P % 4 in (0, 1)}

{4: 15.0,
 5: 945.0,
 8: 2.8845653137679503e+19,
 9: 7.637693625347175e+27,
 12: 4.375874524269406e+66,
 13: 2.5327611850776306e+83,
 16: 8.78872489906208e+147,
 17: 1.1985831550364023e+174}
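
For the smallest case the count can be confirmed by brute force. The sketch below (with a helper, `pairings`, written for this example) enumerates every way to group the 6 partnerships of 4 players into 3 unordered games, compares that with the formula using exact integer division, and then counts how many of those candidates are usable in the sense that every game has four different players.

```python
from itertools import combinations
from math import factorial

def pairings(items):
    """Yield every way to split `items` into unordered 2-element groups."""
    if not items:
        yield []
        return
    head, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(head, partner)] + tail

pairs = list(combinations(range(1, 5), 2))   # the 6 partnerships of 4 players
candidates = list(pairings(pairs))           # each candidate is a list of 3 games
valid = [games for games in candidates
         if all(len(set(a + b)) == 4 for a, b in games)]

G, C, R = 3, 1, 3
formula = factorial(2 * G) // (2 ** G * factorial(C) ** R * factorial(R))
print(len(candidates), formula, len(valid))
```

Note that the formula counts every arrangement of pairs into slots, including arrangements where a player would face themselves; for 4 players only one of the 15 candidates is a real schedule.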

We see that it would have been infeasible to try every schedule, even for *P*=8, let alone 9 or 16.