Section 29.7: Coalition Formation

"Alone I could lift the box or read the label, never both at once. Then the drone joined, and the arm, and suddenly we were a warehouse. The only argument left was who deserved which slice of the bonus."
A Rover That Found Its Team

Big Picture

Coalition formation is the algorithmic side of cooperative game theory: a population of self-interested agents partitions itself into teams so that the total value created is as large as possible, then each team divides its winnings in a way stable enough that no agent wants to walk out. Two problems sit underneath that one sentence. The first is which teams to form, a search over every way to partition the agents, and the number of partitions grows faster than exponentially, so exact optimization is hopeless past a few dozen agents and we lean on anytime algorithms that improve a running best and report how far from optimal they could possibly be. The second is how a formed team splits its gains, where the core and the Shapley value from Chapter 28 give us stability and fairness. This section turns both into runnable code and into a decentralized negotiation that needs no central optimizer at all.

A single agent is bounded by its own capabilities. A rover can carry but not inspect; a drone can survey but not lift; an arm can manipulate but cannot move. Many tasks demand a combination of capabilities that no one agent possesses, and the only way to attempt them is for agents to band together. A coalition is a group of agents that agree to act as one for some purpose and to share whatever they jointly earn. Coalition formation is the process by which a set of agents decides who teams with whom, and it is the natural successor to the negotiation of Section 29.6: negotiation settles terms between parties, while coalition formation decides which parties should be at the table in the first place. The subject is cooperative because agents share the value a team produces, but it stays strategic because each agent will only join a team, and stay in it, if its share beats what it could get elsewhere.

Figure 29.7.1: Coalition formation as a two-step pipeline. Left: three agents, each worth little alone (total $12$). Middle: the coalition-structure search enumerates every partition of the agents and scores each by the sum of its coalition values. Right: the best structure pairs the rover and arm into a complementary team and leaves the drone solo, reaching total value $17$, which beats both the grand coalition ($16$, dragged down by coordination overhead) and the all-singletons baseline ($12$). These are the exact numbers produced by Output 29.7.1.

1. The Two Problems Inside Coalition Formation Beginner

It pays to keep two questions apart, because they have different mathematics and different algorithms. The first is coalition structure generation: given the value every possible team could create, partition all the agents into disjoint coalitions so that the sum of the coalition values is as large as possible. This is an optimization over partitions, and it cares only about total welfare, not about who gets what. The second is value division: once a coalition has formed and earned its value, how do its members split that value so that the split is fair and so that no member, or no subgroup, has reason to defect. This is the payoff problem, and it is where the core and the Shapley value of Section 28.4 do their work. A coalition that produces enormous value but cannot agree on how to divide it will not hold together, so the two problems are coupled in practice even though we analyze them separately.

The whole setup rests on a characteristic function $v$, which assigns to every subset (coalition) $C \subseteq A$ of the agent set $A$ a number $v(C)$, the value that coalition can secure on its own, with $v(\varnothing) = 0$. The interesting case is when teaming helps: a coalition is worth more than the sum of its members acting alone, a property called superadditivity, $v(C_1 \cup C_2) \ge v(C_1) + v(C_2)$ for disjoint $C_1, C_2$. When that holds for every pair, the grand coalition of all agents is always at least as good as any split, and structure generation is trivial. Real systems are rarely so clean: communication overhead, congestion, and conflicting goals mean large teams can be worth less than the right smaller teams, and then finding the best structure becomes a genuine search, exactly the case the demo below is built to expose.

Key Insight: Welfare Decides the Teams, Stability Decides the Shares

Coalition structure generation and value division answer different questions and must not be conflated. Structure generation maximizes a single global number, the total value summed over all coalitions, and is indifferent to who profits. Value division then takes one coalition's earned value as fixed and distributes it so that the result is stable (the core: no subgroup can do better by leaving) and fair (the Shapley value: each agent gets its average marginal contribution). A team that maximizes welfare but cannot divide its winnings stably will dissolve, so a working coalition formation system must solve both, in that order.

2. The Combinatorial Explosion Intermediate

Coalition structure generation is hard for a brutally simple reason: there are too many structures to check. The number of ways to partition a set of $n$ agents into non-empty coalitions is the Bell number $B_n$, which satisfies the recurrence

$$B_{n+1} = \sum_{k=0}^{n} \binom{n}{k}\, B_k, \qquad B_0 = 1.$$

The Bell numbers grow faster than any exponential: $B_3 = 5$, $B_{10} = 115{,}975$, $B_{20} \approx 5.17 \times 10^{13}$, and $B_{50} \approx 1.86 \times 10^{47}$. A related bound makes the difficulty vivid: the number of coalition structures is $\Theta(n^n)$ and at least $\omega(n^{n/2})$, so even writing down every structure is impossible past a few dozen agents, let alone scoring each one. Worse, there are $2^n - 1$ possible coalitions, so even storing a full characteristic function is exponential, which is why real systems represent $v$ compactly (through capabilities, marginal-contribution nets, or a generative rule) rather than as a giant table.

Because exact search is infeasible at scale, coalition structure generation is solved with anytime algorithms: methods that maintain a best-structure-so-far, improve it as they run, and can be stopped at any moment to return that current best together with a guarantee of the form "the optimum is at most a factor $\beta$ better than what I am holding." The classic result here is that searching only a cleverly chosen small fraction of structures (those whose coalitions are all very large or all very small) already bounds the optimum within a known ratio, and the bound tightens as the search is allowed to continue. This is the same anytime-with-bounds philosophy this book applied to communication cost in Chapter 3: when the exact answer is out of reach, compute a feasible answer and a certificate of how far it might be from optimal.

Fun Note: The Grand Coalition Is Not Always the Dream Team

Intuition says "more hands, more value, so put everybody on one team." The Bell numbers exist precisely because that intuition is often wrong. Add coordination overhead, a shared bottleneck, or two agents who actively interfere, and the all-in grand coalition can underperform a tidy split. In the demo below the three-agent team is worth $16$, but pairing two agents and benching the third is worth $17$. The search is not bureaucracy; it is the only way to discover that the best team is sometimes a smaller team.

3. Searching the Structure Space, From Scratch Intermediate

The code below makes both problems concrete on a tiny three-agent example where exhaustive search is still feasible, so we can see the exact optimum. It defines a characteristic function over a rover, a robotic arm, and a drone whose capabilities complement one another, enumerates every partition of the three agents (all $B_3 = 5$ of them), scores each by the sum of its coalition values, and reports the best structure against the grand-coalition and all-singletons baselines. It then takes the winning coalition and divides its value by the Shapley value, the average marginal contribution of each member over all join orders, which is the canonical fair split from Section 28.4.

from itertools import combinations, permutations

agents = ("rover", "arm", "drone")

# Characteristic function: value a coalition can secure on its own.
# Pairs are worth more than their parts (complementary capabilities), but the
# full three-agent team pays a coordination overhead and is worth LESS than the
# best pair-plus-singleton split. The search must discover that.
v_table = {
    frozenset(): 0,
    frozenset(["rover"]): 4, frozenset(["arm"]): 3, frozenset(["drone"]): 5,
    frozenset(["rover", "arm"]): 12, frozenset(["rover", "drone"]): 11,
    frozenset(["arm", "drone"]): 10, frozenset(["rover", "arm", "drone"]): 16,
}
def v(coalition):
    return v_table[frozenset(coalition)]

# Coalition structure generation: enumerate every partition of the agent set.
def partitions(items):
    items = list(items)
    if not items:
        yield []; return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        for i, block in enumerate(smaller):                 # add 'first' to a block
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        yield [[first]] + smaller                           # or 'first' starts a block

scored = sorted(((sum(v(b) for b in cs), cs) for cs in partitions(agents)),
                reverse=True)
best_val, best_cs = scored[0]

# Value division of the winning coalition by the Shapley value:
# average marginal contribution of each agent over all join orders.
def shapley(coalition):
    coalition = list(coalition); n = len(coalition)
    phi = {a: 0.0 for a in coalition}
    for order in permutations(coalition):
        seen = []
        for a in order:
            phi[a] += v(seen + [a]) - v(seen); seen.append(a)
    fact = 1
    for k in range(1, n + 1): fact *= k                     # n! orders
    return {a: phi[a] / fact for a in coalition}

Code 29.7.1: A from-scratch coalition formation core. partitions generates every coalition structure recursively, the sorted scoring picks the welfare-maximizing structure, and shapley divides the winning coalition's value fairly by averaging marginal contributions over all join orders. The full script also prints the ranked structures and the baselines shown below.

agents                     : 3
coalition structures (Bell): 5
all singletons   value     : 12
grand coalition  value     : 16
BEST structure   value     : 17
BEST structure             : {rover,arm} | {drone}

ranked coalition structures:
   17   {rover,arm} | {drone}
   16   {rover,arm,drone}
   14   {rover} | {arm,drone}
   14   {arm} | {rover,drone}
   12   {rover} | {arm} | {drone}

Shapley split of coalition {rover,arm} (value 12):
  rover  -> 6.50
  arm    -> 5.50
  sum of shares -> 12.00  (equals coalition value: efficiency)

Output 29.7.1: The welfare-maximizing structure pairs the rover and arm and leaves the drone alone, reaching total value $17$, ahead of the grand coalition ($16$) and the all-singletons baseline ($12$). The Shapley split of the winning pair gives the rover $6.5$ and the arm $5.5$, and the shares sum exactly to the coalition value of $12$, the efficiency property of the Shapley value.

Three observations carry the lesson. First, the best structure is genuinely not the grand coalition, so the search did real work: had we assumed superadditivity and put everyone on one team, we would have left value $1$ on the table. Second, the Shapley split is unequal ($6.5$ versus $5.5$) because the rover's marginal contributions across the join orders are larger than the arm's, and the shares still sum to the full $12$. Third, this exhaustive method is honest only because $n = 3$; at $n = 50$ the $1.86 \times 10^{47}$ structures from Section 2 forbid enumeration, and the anytime search of that section is the only option.

4. Distributed Coalition Formation Advanced

Code 29.7.1 assumes a single process that can see the entire characteristic function and enumerate every structure. In the multi-agent setting this book cares about, no such omniscient optimizer exists. Agents are separate processes on separate machines, each knowing only its own capabilities, the value of coalitions it could personally join, and whatever its neighbors tell it. Coalition formation then becomes a decentralized search: agents propose teams to one another, evaluate the value and the proposed split, accept or counter, and re-form when a better offer arrives, with the global structure emerging from many local agreements rather than from a central solver. This mirrors the negotiation protocols of Section 29.6 and the broader distribution-without-a-coordinator stance of Chapter 27, and it inherits their hard parts: agents have partial information, messages cost time, and an agent that commits to a coalition based on stale offers may regret it.

Decentralized formation buys robustness and scalability at the price of optimality. No agent ever materializes the full structure space, so the guarantees are local: protocols aim for structures that are stable (in or near the core, so no subgroup wants to break away) and reachable by local moves, not necessarily the global welfare optimum. The design questions are the familiar distributed-systems ones. How do agents discover potential partners without a global directory? How is a proposed coalition's value computed when its members hold private information? How do we prevent two agents from each believing they recruited the same third agent, a coordination race that consensus mechanisms in Section 29.9 are built to resolve? Coalition formation, seen this way, is one more instance of the book's spine: a global objective pursued by many machines exchanging only local information.

Practical Example: A Warehouse Robot Team That Re-Formed Itself

Who: A robotics engineer deploying a heterogeneous fleet in an order-fulfillment warehouse.

Situation: The fleet mixed wheeled rovers (fast transport), shelf-scanning drones (inventory reads), and arm units (pick and place), and incoming orders each needed some combination of those three capabilities.

Problem: A central planner that assigned every robot to every order recomputed from scratch whenever a robot's battery dipped or a new order arrived, and its solve time grew past the seconds the floor could tolerate as the fleet crossed a hundred units.

Dilemma: Keep the central optimizer, which found high-value teams but could not re-plan fast enough under churn, or switch to decentralized coalition formation, which re-planned locally and instantly but with only local guarantees on team quality.

Decision: They moved to decentralized formation: each order broadcast its required capabilities, nearby robots bid by advertising what they brought, and a coalition closed once its members covered the capability set with a Shapley-based split of the order's reward.

How: Robots negotiated peer-to-peer with a timeout; when a battery dropped or an order completed, only the affected coalitions re-formed, leaving the rest of the floor untouched, exactly the local-move dynamics of this section.

Result: Re-planning latency stayed flat as the fleet grew to several hundred units, and measured throughput came within a few percent of the old central optimum while surviving robot dropouts that would previously have stalled the whole plan.

Lesson: When the environment churns and the fleet is large, a decentralized search that re-forms only the affected coalitions beats a globally optimal planner that must re-solve everything. We return to fleets like this in the robotics case study of Chapter 39.

5. Overlapping Coalitions and Dynamic Re-Formation Advanced

Two assumptions in the basic model break in real deployments. The first is that coalitions are disjoint. An agent often has resources it can split across several teams at once: a communication relay drone can serve two delivery coalitions simultaneously, and a GPU can host fractions of several inference jobs. Overlapping coalition formation drops the partition constraint and lets an agent allocate fractions of its capacity to multiple coalitions, which enlarges the achievable value but also enlarges the search and complicates the payoff division, since an agent's share now depends on how it spread itself. The second broken assumption is that the world holds still. Environments change: tasks arrive and complete, agents fail or recharge, and the value of a team shifts as conditions move. Dynamic coalition formation treats structure as something that is continuously revised rather than computed once, re-forming teams as the characteristic function itself drifts, which is why the warehouse fleet above re-planned locally on every battery dip.

These extensions matter far beyond robotics. When an orchestrator assembles a crew of language-model agents for a complex task, a planner, a coder, a critic, and a retriever, it is forming a coalition whose combined capabilities cover the job, and as the task evolves it may add a specialist or drop an idle member, a dynamic re-formation we develop in Chapter 32. Ride-sharing and logistics are coalition formation at metropolitan scale: riders and drivers, or shipments and trucks, form short-lived value-sharing groups (a shared ride, a consolidated load) that dissolve on completion and re-form continuously as demand moves across the map. In every one of these, the two problems of Section 1 recur: decide the teams to maximize value, then divide the gains so the teams hold.

Thesis Thread: Cooperative Game Theory Becomes a Distributed Search

This section is where the static solution concepts of Chapter 28, the core and the Shapley value, turn into something a cluster of machines actually runs. Structure generation is a distributed optimization over a super-exponential space, attacked with the same anytime-with-bounds discipline the book used for communication cost; value division is the core and Shapley value computed, often approximately, across agents that hold private information. The grand coalition that an undergraduate game-theory course assumes is optimal becomes, at scale, just one point in a space too large to enumerate, and finding a better point with only local information is the multi-agent face of the book's spine. The same machinery reappears in cooperative multi-agent reinforcement learning, where agents must learn which teammates to value, in Chapter 30.

Research Frontier: LLM Agents Form Their Own Crews (2024 to 2026)

The liveliest current application of coalition formation is the automatic assembly of language-model agent teams. Frameworks in the lineage of AutoGen and CrewAI, and 2024 to 2025 work on automated multi-agent system design (for example AgentVerse-style and AFlow-style search over agent team compositions and graphs), treat "which agents, with which roles, should collaborate on this task" as a coalition-structure search, and several systems now learn the team composition rather than hand-specifying it. A parallel line studies Shapley-value credit assignment for multi-agent LLM pipelines, asking which agent's contribution actually moved the final answer so that compute and trust can be allocated fairly, the same value-division problem this section formalizes, now applied to tokens and tool calls instead of warehouse robots. Open questions are sharp: characteristic-function values are themselves noisy LLM evaluations, the structure space is the Bell number of available agents, and dynamic re-formation must happen mid-task as the problem reveals itself. We build the orchestration substrate for these crews in Chapter 32.

Library Shortcut: SciPy and shapley-value Tools Replace the Hand-Rolled Search

The partition enumeration and Shapley loop in Code 29.7.1 are textbook implementations meant to expose the mechanism. In practice the structure-generation step is an optimization you hand to a solver, and the value-division step is a one-call library. Casting coalition structure generation as a set-partitioning integer program lets a mature mixed-integer solver prune the super-exponential space far past what a naive enumerator reaches:

# Value division: exact Shapley value via an off-the-shelf computation.
from itertools import permutations
from math import factorial

def shapley_exact(players, v):                 # v: callable on a frozenset
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        seen = set()
        for p in order:
            phi[p] += v(seen | {p}) - v(seen)  # marginal contribution
            seen.add(p)
    return {p: phi[p] / factorial(len(players)) for p in players}

# Structure generation at scale: a set-partitioning MILP handed to a solver
# (pulp / OR-tools / SciPy milp), maximizing summed coalition value subject to
# every agent landing in exactly one chosen coalition. The solver's branch-and-
# bound supplies the same anytime bound discussed in Section 2, for free.

Code 29.7.2: The same two steps as Code 29.7.1, delegated. Exact Shapley computation collapses to a short reusable function, and structure generation becomes a set-partitioning MILP whose branch-and-bound gives the anytime optimality bound that Section 2 motivated, with the solver, not your code, pruning the Bell-number search space.

Coalition formation completes the cooperative half of multi-agent systems: agents decide which teams to form so total value is maximized, then divide each team's gains stably and fairly. What we have not yet asked is the dual question. Given the teams, or even given individual agents, exactly which task goes to which agent or coalition, and how do we assign them efficiently when there are many tasks and many agents at once? That is task allocation, and it is where Section 29.8 takes us next, carrying the value functions and the negotiation machinery of this section forward into the assignment problem.

Exercise 29.7.1: When the Grand Coalition Wins Conceptual

Using the characteristic function in Code 29.7.1, state the exact condition on $v(\{\text{rover}, \text{arm}, \text{drone}\})$ under which the grand coalition becomes the welfare-maximizing structure rather than the $\{\text{rover}, \text{arm}\} \mid \{\text{drone}\}$ split. Then explain in words why a superadditive characteristic function (every union worth at least the sum of its parts) makes coalition structure generation trivial, and what real-world effect (name one) destroys superadditivity and forces the search of Section 2.

Exercise 29.7.2: Scale the Search and the Split Coding

Extend Code 29.7.1 to $n = 6$ agents with a characteristic function of your choice (for instance, a coalition's value is the number of distinct capabilities its members jointly cover, minus a coordination penalty proportional to the squared coalition size). Print the Bell number $B_6$, confirm your partition generator produces exactly that many structures, report the best structure and its value against the grand-coalition and all-singletons baselines, and compute the Shapley split of the winning coalition. Then time the run and argue, from the Bell-number growth in Section 2, at roughly what $n$ this exhaustive approach stops being usable on your machine.

Exercise 29.7.3: Is the Split in the Core? Analysis

For the winning coalition $\{\text{rover}, \text{arm}\}$ with value $12$ and the Shapley split $(6.5, 5.5)$ from Output 29.7.1, check whether the split lies in the core: verify that no member can do strictly better by leaving (compare each member's share to its singleton value), and state the inequalities the core requires for a two-agent coalition. Then construct a small three-agent characteristic function whose grand coalition has an empty core (no stable division exists), and explain what an empty core implies for whether that grand coalition can hold together at all, connecting your answer to the stability arguments of Section 28.4.