Section 27.7: Coordination and Cooperation

"We all wanted to cross the intersection at once, so we held a meeting about it. The meeting is still going. The intersection is empty."
An Agent Stuck Waiting on a Lock

Big Picture

When many agents share a world, the central problem is no longer how each one thinks but how their actions fit together: coordination keeps them from interfering, and cooperation steers them toward a goal none could reach alone. An agent acting alone needs only a good plan. The moment a second agent shares its environment, every action becomes contingent on what the others do, and the agents face a choice with a cost on each side. They can talk constantly, negotiating each shared decision at the price of communication that grows with the group, or they can agree in advance on rules and roles that let them act without talking at all. This section lays out the mechanisms that span that range, from organizational structure and social conventions to partial global planning and joint intentions, and shows in one runnable simulation that a shared convention can erase both the collisions and the messages that a naive negotiation would incur.

The previous sections of this chapter built agents that solve pieces of a problem: blackboard systems that pool partial results, the contract-net protocol that hands tasks to the best bidder, and distributed constraint optimization that searches for a joint assignment. Each assumed the agents already wanted to work together and asked how they should organize the work. This section steps back to the mechanism underneath all of them. Coordination is the act of organizing actions so that agents do not interfere and collectively make progress; cooperation is coordination aimed at a goal the agents share. Both are needed because a distributed intelligence, like a distributed computation, is only as good as the way its parts combine, and combining them is never free.

1. Three Ways to Coordinate (Beginner)

There are three broad mechanisms by which agents organize their actions, and they differ mostly in how much they decide ahead of time versus at run time. The first is organizational structure: predefined roles and lines of authority that fix, before the agents ever act, who does what and who defers to whom. A manager agent assigns subtasks; worker agents accept them; nobody negotiates the chain of command because it was settled at design time. The second is conventions, also called social laws: rules that every agent follows so that conflict is avoided cheaply, without any agent having to ask another what it intends. Driving on the right is the canonical example; no two drivers exchange messages about which side to use, because the rule already answers the question. The third is explicit coordination through planning and communication: agents reason about one another's plans and exchange messages to resolve conflicts that the structure and the conventions did not anticipate.

These mechanisms sit on a spectrum of how much they pre-commit. Organizational structure and conventions push the coordination cost to design time, so run-time behavior is cheap but rigid. Explicit planning and communication keep the cost at run time, so behavior is flexible but expensive in messages and latency. Real multi-agent systems blend all three, exactly as a distributed-systems engineer blends static configuration with dynamic consensus, and the engineering question is always which decisions to freeze into structure and which to leave for the agents to negotiate live. Figure 27.7.1 contrasts the two extremes.

Figure 27.7.1: Two ways four agents share one intersection cell. On the left, with no convention, every contender broadcasts a claim to every other (the dashed orange messages) and ties still cause collisions (the spark). On the right, a fixed priority law $N \succ E \succ S \succ W$ is known to all in advance, so one agent crosses at a time with no messages exchanged and no collision possible. The simulation in Code 27.7.1 measures both regimes.

Key Insight: A Convention Is Precomputed Coordination

A social convention is coordination paid for once, at design time, and then reused for free at run time. Instead of agents discovering who yields by exchanging messages every time they meet, a shared rule answers the question before it is asked. The rule trades flexibility for cost: it cannot adapt to situations its designers did not foresee, but within its scope it removes both the communication and the risk of conflict entirely. Every coordination mechanism in this section is a different point on that same trade between what you decide in advance and what you decide live.

2. Partial Global Planning (Intermediate)

When agents must coordinate whole plans rather than single actions, explicit coordination needs a shared representation that no single agent has to hold in full. Partial global planning is the classic answer. Each agent builds a local plan for its own goals, then shares an abstracted view of that plan with the others; agents merge the shared fragments into a partial picture of what the group as a whole intends, detect where their plans conflict or where one could help another, and adjust locally in response. No agent ever holds the complete global plan, which would be too large and too volatile to maintain, and no central planner is required. The word partial is load-bearing: the shared plan is always an incomplete, continually revised approximation, assembled from pieces, much as a distributed system maintains an eventually consistent view rather than a single authoritative state.

This is the same structural idea as the blackboard of Section 27.4 and the constraint network of Section 27.6, applied to plans over time rather than to hypotheses or assignments. Agents publish what they intend, read what others intend, and locally reconcile. The cost is communication proportional to how much plan detail they share and how often plans change, which is exactly why agents share abstractions of their plans rather than every step: sending less detail is the planning analogue of sending fewer or smaller gradients in distributed optimization. We make that cost concrete next, because it is the hinge on which every coordination decision turns.
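The publish, merge, and reconcile loop described above can be sketched in a few lines. This is a minimal illustration rather than the original partial-global-planning algorithm: the names `Activity`, `abstract`, `merge_views`, `find_conflicts`, and `reconcile`, the choice to model a plan as timed claims on named resources, and the tie-break by agent name are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Activity:
    agent: str
    resource: str      # what the step needs (hypothetical abstraction level)
    start: int
    end: int           # exclusive
    detail: str = ""   # private step detail; never shared

def abstract(plan):
    """Publish only who needs which resource when; drop the private detail."""
    return [Activity(a.agent, a.resource, a.start, a.end) for a in plan]

def merge_views(*fragments):
    """Combine the received fragments into one partial global view."""
    return sorted((a for frag in fragments for a in frag),
                  key=lambda a: (a.start, a.agent))

def find_conflicts(view):
    """Pairs of activities by different agents that overlap on one resource."""
    pairs = []
    for i, a in enumerate(view):
        for b in view[i + 1:]:
            if (a.agent != b.agent and a.resource == b.resource
                    and a.start < b.end and b.start < a.end):
                pairs.append((a, b))
    return pairs

def reconcile(plan, view, me):
    """Local adjustment: yield to any agent whose name sorts before mine
    (an assumed tie-break) by delaying my overlapping activity."""
    adjusted = []
    for act in plan:
        start, length = act.start, act.end - act.start
        for other in view:
            overlaps = (other.resource == act.resource
                        and start < other.end and other.start < start + length)
            if other.agent < me and overlaps:
                start = other.end
        adjusted.append(Activity(me, act.resource, start, start + length, act.detail))
    return adjusted

plan_a = [Activity("A", "dock", 0, 3, "unload crate 17")]
plan_b = [Activity("B", "dock", 2, 4, "load pallet 9"),
          Activity("B", "lift", 5, 7, "raise pallet 9")]

view = merge_views(abstract(plan_a), abstract(plan_b))
conflicts = find_conflicts(view)
new_a = reconcile(plan_a, view, "A")
new_b = reconcile(plan_b, view, "B")
print("conflicts before:", len(conflicts))
print("conflicts after: ", len(find_conflicts(merge_views(abstract(new_a), abstract(new_b)))))
```

Each agent runs `reconcile` on its own plan only, so no agent ever edits another's plan or holds its private detail; the shared view contains just enough to expose the overlap on the dock.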

3. The Cost of Coordination (Intermediate)

Coordination is never free, and its cost can swallow its benefit. Suppose a task would take time $T_1$ on a single agent and is split across $n$ agents. The useful work each does is at best $T_1 / n$, but the agents must also spend time communicating and reconciling, an overhead $C(n)$ that typically grows with the group size as agents exchange more messages with more peers. The completion time looks like

$$T(n) \;=\; \frac{T_1}{n} \;+\; C(n),$$

and because $C(n)$ rises while $T_1 / n$ falls, there is a group size beyond which adding agents makes the group slower. This is the multi-agent face of the ceiling we met for parallel computation in Section 3.5. Amdahl's law says that the fraction of the work that cannot be parallelized bounds the speedup no matter how many workers you add; coordination plays that role here, and it is harsher than a fixed serial fraction because it grows with $n$, so past some group size the speedup does not merely level off but reverses. A swarm that spends more time deciding who does what than doing it has crossed that line. The discipline is to keep $C(n)$ small, and the cheapest way to keep it small is to not communicate at all when a convention can decide the matter.
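A quick numerical sketch makes the turning point visible. The overhead form below, $C(n) = c \cdot n(n-1)$, is an assumption chosen to mirror an all-to-all message round (every agent messages every other); the values of `T1` and `C` and the helper names `completion_time` and `all_to_all` are illustrative, not taken from any measured system.

```python
def completion_time(n, t1, overhead):
    """T(n) = T1 / n + C(n) for a given overhead function C."""
    return t1 / n + overhead(n)

def all_to_all(c):
    """Assumed overhead: n * (n - 1) messages, each costing c time units."""
    return lambda n: c * n * (n - 1)

T1, C = 1000.0, 1.0   # illustrative single-agent time and per-message cost

times = {n: completion_time(n, T1, all_to_all(C)) for n in range(1, 21)}
best_n = min(times, key=times.get)

for n in (1, 2, 4, 8, 16):
    print(f"n={n:2d}  T(n)={times[n]:7.1f}")
print("fastest group size:", best_n)
```

With these illustrative numbers the minimum falls at $n = 8$; doubling the group to 16 agents is slower than using 4, because the message term has overtaken the savings from splitting the work.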

Thesis Thread: Coordination Overhead Is the Communication Tax, Again

From the first chapter of this book, the recurring tax on scaling out has been communication: the all-reduce that synchronizes gradients, the shuffle that moves keys, the consensus round that commits a value. Coordination among agents is the same tax wearing a different coat. The term $C(n)$ above is to a multi-agent system what the all-reduce time is to data-parallel training: a cost that grows with the number of participants and eventually caps the benefit of adding more. Every technique in this section, conventions, roles, plan abstraction, is a way to shrink $C(n)$, just as gradient compression and local updates shrink the all-reduce. The intelligence is distributed, but the tax is the one we have been paying all along.

4. Conventions Replace Communication (Intermediate)

The sharpest lever on $C(n)$ is a convention, because a shared rule replaces a negotiation. When two agents both want the one resource and a convention says who yields, they exchange zero messages where a negotiation would have exchanged several; multiply that saving across every encounter and across a growing group, and the convention turns a communication term that scaled with the group into a constant. This is precisely the insight that drives communication-efficient distributed optimization in Section 10.7, where workers take several local steps under a shared update rule instead of synchronizing every step: in both cases a rule agreed in advance lets the participants act in coordination without a round trip. A convention is to coordination what a local-update schedule is to optimization, the same move to cut communication rounds, made in a different language.

The simulation below makes the saving quantitative. Four streams of agents, one from each direction, share a single intersection cell in a gridworld, each agent wanting to cross to the far side. In the regime without a convention, every direction that wants the cell on a given tick broadcasts a claim to every other contender (an all-to-all round of messages); each contender then decides independently, by a fair coin flip, whether to enter, so two or more may enter at once and collide. In the regime with a convention, a fixed priority order $N \succ E \succ S \succ W$ is known to all agents in advance, so the highest-priority contender simply goes and the rest wait, with no messages exchanged and no collision possible. We count collisions, coordination messages, and successful crossings for both regimes, using the same seed and the same arrival probability.

import random

DIRS = ["N", "E", "S", "W"]
PRIORITY = {"N": 0, "E": 1, "S": 2, "W": 3}   # the shared social law

def run(use_convention, ticks, seed):
    rng = random.Random(seed)
    collisions = messages = crossings = 0
    queues = {d: 0 for d in DIRS}             # agents waiting from each direction
    for _ in range(ticks):
        for d in DIRS:                        # random arrivals at the intersection
            if rng.random() < 0.5:
                queues[d] += 1
        contenders = [d for d in DIRS if queues[d] > 0]
        if not contenders:
            continue
        if use_convention:
            winner = min(contenders, key=lambda d: PRIORITY[d])   # rule decides
            queues[winner] -= 1                                   # zero messages
            crossings += 1
        else:
            n = len(contenders)
            messages += n * (n - 1)           # all-to-all claim messages this tick
            enterers = [d for d in contenders if rng.random() < 0.5]
            if len(enterers) >= 2:            # two or more enter at once -> crash
                collisions += 1
            elif len(enterers) == 1:
                queues[enterers[0]] -= 1
                crossings += 1
    return collisions, messages, crossings

TICKS, SEED = 2000, 7
c_no, m_no, x_no = run(False, TICKS, SEED)
c_yes, m_yes, x_yes = run(True, TICKS, SEED)
print("WITHOUT convention:", c_no, "collisions,", m_no, "messages,", x_no, "crossings")
print("WITH    convention:", c_yes, "collisions,", m_yes, "messages,", x_yes, "crossings")
print("removed:", c_no - c_yes, "collisions and", m_no - m_yes, "messages")

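Code 27.7.1: A shared convention versus live negotiation at a gridworld intersection. The only difference between the two regimes is whether a priority law is agreed in advance. Both runs use the same seed and arrival probability; because the negotiation regime also draws its entry coin flips from that one generator, the two arrival sequences are statistically equivalent rather than tick-for-tick identical.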

WITHOUT convention: 1412 collisions, 23994 messages, 481 crossings
WITH    convention: 0 collisions, 0 messages, 2000 crossings
removed: 1412 collisions and 23994 messages

Output 27.7.1: Over 2000 ticks with the same seed and arrival probability, the convention removes all 1412 collisions and all 23994 coordination messages while raising successful crossings from 481 to 2000. The shared rule converts a communication cost that grew with every encounter into a constant of zero.

The numbers are stark because the two regimes differ only in whether a rule was agreed in advance. Without the convention the agents pay 23994 messages and still crash on more than two thirds of the contested ticks; with it they pay nothing and never crash. That is the entire argument for conventions in one table of integers: a rule decided once removes both the communication and the conflict that a run-time negotiation would incur on every encounter. The catch, of course, is that a convention only helps for situations its designers anticipated, which is why real systems keep explicit coordination in reserve for the cases the rules did not cover.
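One subtlety matters when reproducing this comparison. In Code 27.7.1 the negotiation branch draws its entry coin flips from the same generator that produces the arrivals, so the two regimes share a seed and an arrival distribution but not the exact arrival sequence. A hedged variant that feeds both regimes literally the same arrivals uses two generators. The function name `run_paired`, the `seed + 1` offset for the second generator, and the returned arrival log are assumptions added for illustration, and the counts it prints will differ from Output 27.7.1.

```python
import random

DIRS = ["N", "E", "S", "W"]
PRIORITY = {"N": 0, "E": 1, "S": 2, "W": 3}   # the shared social law

def run_paired(use_convention, ticks, seed):
    arrivals_rng = random.Random(seed)        # consumed identically in both regimes
    entry_rng = random.Random(seed + 1)       # used only by the negotiation regime
    collisions = messages = crossings = 0
    queues = {d: 0 for d in DIRS}
    arrival_log = []                          # kept so the two runs can be compared
    for _ in range(ticks):
        arrived = tuple(arrivals_rng.random() < 0.5 for _ in DIRS)
        arrival_log.append(arrived)
        for d, came in zip(DIRS, arrived):
            queues[d] += int(came)
        contenders = [d for d in DIRS if queues[d] > 0]
        if not contenders:
            continue
        if use_convention:
            winner = min(contenders, key=PRIORITY.get)   # rule decides, zero messages
            queues[winner] -= 1
            crossings += 1
        else:
            n = len(contenders)
            messages += n * (n - 1)           # all-to-all claim messages this tick
            enterers = [d for d in contenders if entry_rng.random() < 0.5]
            if len(enterers) >= 2:            # two or more enter at once: collision
                collisions += 1
            elif len(enterers) == 1:
                queues[enterers[0]] -= 1
                crossings += 1
    return collisions, messages, crossings, arrival_log

no_conv = run_paired(False, 2000, 7)
with_conv = run_paired(True, 2000, 7)
print("same arrivals:", no_conv[3] == with_conv[3])
print("without convention (collisions, messages, crossings):", no_conv[:3])
print("with convention    (collisions, messages, crossings):", with_conv[:3])
```

Separating the two sources of randomness is a common way to pair simulation runs: any difference in the counts is then attributable to the coordination regime alone, not to a different draw of arrivals.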

Fun Note: The Meeting That Outlasted the Task

The classic failure of run-time coordination is the four-way stop where everyone is too polite. Each driver waves the others on, nobody moves, and the negotiation outlasts the crossing it was meant to arrange. Traffic engineers solved this not by making drivers better negotiators but by installing a convention: a fixed yield-to-the-right rule, or a light that decides for everyone. The agents got no smarter; the coordination simply moved out of the intersection and into the rulebook, which is exactly the move Output 27.7.1 rewards.

Library Shortcut: PettingZoo Gives You the Shared Environment for Free

In Code 27.7.1 we hand-rolled the intersection, the arrivals, and the turn order so the mechanism would be visible. To study coordination conventions in a reusable multi-agent environment, the PettingZoo library (the multi-agent counterpart of Gymnasium) supplies the shared world, the per-agent observation and action spaces, and the turn or step cycle, so you write only the policy and the convention:

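# pip install "pettingzoo[mpe]"   (the mpe extra installs the MPE dependencies)
# Recent PettingZoo releases deprecate pettingzoo.mpe in favor of the separate
# mpe2 package; adjust the import below if your installed version requires it.
from pettingzoo.mpe import simple_spread_v3   # cooperative navigation; collisions are penalized

env = simple_spread_v3.parallel_env(N=4)      # 4 agents share one continuous world
obs, info = env.reset(seed=7)

def policy(agent, observation):
    # Placeholder so the loop runs: a random action. Replace with your convention.
    return env.action_space(agent).sample()

while env.agents:                             # the agent list empties when the episode ends
    # A convention/policy maps each agent's observation to an action; the env
    # handles stepping all agents together and reporting collisions in the reward.
    actions = {a: policy(a, obs[a]) for a in env.agents}
    obs, rewards, term, trunc, info = env.step(actions)
env.close()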

Code 27.7.2: The same kind of coordination study in a standard environment. About forty lines of hand-built intersection, arrival, and collision bookkeeping collapse to an env.step call; PettingZoo owns the shared state, the synchronized stepping, and the collision accounting, leaving you to design only the policy and the convention it follows.

5. Joint Intentions and Teamwork (Advanced)

Conventions and roles coordinate actions, but robust cooperation toward a shared goal needs something stronger: a theory of what it means for a group to be committed to a goal together. Joint-intentions theory supplies it. A team does not merely consist of agents that happen to share a goal; it consists of agents that hold a joint commitment to that goal, and that commitment carries an obligation. The defining clause is this: if any team member comes to believe the goal is achieved, or has become impossible, or is no longer wanted, it is obligated to make that belief known to the rest of the team. A private change of mind is not allowed to silently break the team; the agent must inform the others so the joint commitment can be revised together.

This informing obligation is what makes teamwork robust rather than brittle. Without it, one agent discovering that the goal is unreachable would simply stop, and its teammates would keep working toward a goal that can no longer be met, wasting effort and possibly causing harm. The obligation to communicate the change converts a collection of individually rational agents into a team that fails and recovers coherently. It is the cooperative analogue of the failure-signaling that makes a distributed computation robust: a worker that dies must be detected and its work reassigned, and here an agent whose belief about the goal changes must announce it so the team can re-plan. Teamwork theory is, in effect, a fault-tolerance protocol for shared intentions.
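The informing obligation can be sketched as a small protocol. This is a simplified illustration, not a full implementation of joint-intentions theory: the `Team` and `Member` classes, the `GoalStatus` values, and the shortcut of having teammates adopt a reported status immediately (rather than establishing mutual belief) are assumptions made for the example.

```python
from enum import Enum

class GoalStatus(Enum):
    ACTIVE = "active"
    ACHIEVED = "achieved"
    IMPOSSIBLE = "impossible"
    IRRELEVANT = "irrelevant"    # the goal is no longer wanted

class Team:
    def __init__(self, goal):
        self.goal = goal
        self.members = []
        self.log = []            # (sender, receiver, status) for every message sent

    def join(self, member):
        self.members.append(member)
        member.team = self

    def broadcast(self, sender, status):
        for m in self.members:
            if m is not sender:
                self.log.append((sender.name, m.name, status))
                m.receive(status)

class Member:
    def __init__(self, name):
        self.name = name
        self.team = None
        self.belief = GoalStatus.ACTIVE

    def observe(self, status):
        """A private belief change. Leaving ACTIVE triggers the obligation to inform."""
        if status is self.belief:
            return               # nothing changed, so nothing to say
        self.belief = status
        if status is not GoalStatus.ACTIVE:
            self.team.broadcast(self, status)

    def receive(self, status):
        # Simplification: adopt the teammate's report without re-broadcasting.
        self.belief = status

    def working(self):
        return self.belief is GoalStatus.ACTIVE

team = Team("deliver package to rooftop")
drones = [Member(f"drone{i}") for i in (1, 2, 3)]
for d in drones:
    team.join(d)

drones[0].observe(GoalStatus.IMPOSSIBLE)      # one drone sees the rooftop is unsafe
print("still working:", [d.name for d in drones if d.working()])
print("messages sent:", len(team.log))
```

No messages flow while every member still believes the goal is active; the only traffic is the single broadcast at the moment one member's belief about the goal changes, after which the whole team stops together.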

Practical Example: A Delivery-Drone Team That Knew When to Speak Up

Who: A robotics engineer building a coordinated delivery-drone fleet for a logistics startup.

Situation: Three drones were assigned to jointly deliver a heavy package that required all three to lift together to a rooftop drop point.

Problem: Early flights wasted battery and time when conditions changed mid-mission: one drone would detect that the rooftop was now unsafe and quietly return, leaving the other two hovering and waiting for a lift that would never complete.

Dilemma: Add a heavyweight central mission controller that polled every drone's status continuously, costing bandwidth and a single point of failure, or give each drone a lightweight obligation to broadcast only when its belief about the shared goal changed.

Decision: They adopted joint-intentions discipline: drones acted on a fixed role convention during normal flight and exchanged no status messages, but any drone that came to believe the joint goal was achieved, impossible, or abandoned was required to broadcast that belief immediately.

How: Each drone ran a local monitor; on a goal-belief change it sent one message to the team, which then jointly re-planned (abort, retry, or pick a new drop point) before any drone acted alone.

Result: Wasted hovering time fell sharply because a drone that found the rooftop unsafe told the others at once instead of silently leaving, and the fleet either recovered or aborted as a unit. Communication stayed near zero during nominal flight and spiked only at the moments that mattered.

Lesson: Robust cooperation is not constant chatter; it is silence under a shared convention punctuated by an obligation to speak the instant the shared goal's status changes.

6. Why Cooperation Has to Be in the Agents' Interest (Advanced)

Every mechanism so far assumed the agents want to cooperate. That assumption is safe when one designer owns all the agents and can simply build them to follow the conventions and honor the joint commitments. It is not safe when the agents are built by different parties with different objectives, because then an agent gains nothing by following a convention that its peers can exploit, and a rational self-interested agent will deviate the moment deviation pays. A convention is only stable if following it is in each agent's interest given that the others follow it too; otherwise the rule erodes as agents defect one by one. This is the bridge from coordination to game theory: the next chapter, Chapter 28, makes the notion of a stable convention precise as an equilibrium, a profile of behavior from which no agent benefits by unilaterally deviating, and shows how to design payoffs (mechanism design) so that the cooperative outcome is also the self-interested one.

The practical message is that coordination and incentives cannot be separated. A drive-on-the-right convention is self-enforcing because deviating gets you killed, so it needs no police; a tax-honesty convention is not self-enforcing, so it needs auditing. When you design a multi-agent system, you must ask not only what rule would coordinate the agents but whether each agent, reasoning for itself, would choose to keep it. We return to this question with the full apparatus of equilibria, cooperative games, and auctions in Chapter 28, and it underlies every negotiation and task-allocation protocol in the multi-agent chapters that follow.
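The question "would each agent, reasoning for itself, keep the rule?" can be checked mechanically for small cases. The sketch below tests whether a convention survives unilateral deviation in a two-player game. The helper `is_stable` is hypothetical, and every payoff number is invented for illustration; the formal treatment belongs to Chapter 28.

```python
def is_stable(payoffs, profile):
    """True if neither player gains by deviating alone from `profile`.
    `payoffs` maps (action0, action1) -> (payoff0, payoff1)."""
    actions = sorted({a for prof in payoffs for a in prof})
    for player in (0, 1):
        current = payoffs[profile][player]
        for alt in actions:
            deviated = list(profile)
            deviated[player] = alt
            if payoffs[tuple(deviated)][player] > current:
                return False
    return True

# Driving: matching sides is safe, mismatching is a crash (illustrative payoffs).
driving = {("R", "R"): (1, 1), ("L", "L"): (1, 1),
           ("R", "L"): (-10, -10), ("L", "R"): (-10, -10)}

# Reporting: cheating pays if the other party stays honest (illustrative payoffs).
reporting = {("honest", "honest"): (3, 3), ("honest", "cheat"): (0, 4),
             ("cheat", "honest"): (4, 0), ("cheat", "cheat"): (1, 1)}

# Auditing modeled as an assumed penalty of 3 on whoever cheats.
audited = {prof: tuple(p - (3 if act == "cheat" else 0) for p, act in zip(pay, prof))
           for prof, pay in reporting.items()}

print("drive on the right stable:", is_stable(driving, ("R", "R")))
print("honesty stable unaudited: ", is_stable(reporting, ("honest", "honest")))
print("honesty stable audited:   ", is_stable(audited, ("honest", "honest")))
```

The driving convention needs no enforcement because the crash payoff already punishes deviation, whereas honesty becomes stable only once the audit penalty changes the payoffs, which is the design move the text attributes to mechanism design.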

Research Frontier: Coordinating Teams of LLM Agents (2024 to 2026)

The modern echo of joint intentions and social conventions is the orchestration of teams of large-language-model agents, where the same problems return in new clothes. Role and protocol frameworks such as AutoGen (Wu et al., 2023 to 2024), CrewAI, and LangGraph keep multi-agent LLM systems coherent by assigning explicit roles (planner, researcher, critic) and fixing the message protocol between them, which is organizational structure and convention applied to language agents rather than robots. A recent line of work on multi-agent failure analysis (for example the MAST taxonomy of Cemri and colleagues) finds that a large share of failures in LLM agent teams are coordination failures: agents that drop the shared goal without telling the others, or that talk past one another, which is precisely the breakdown that joint-intentions theory was designed to prevent. The engineering frontier is to give LLM agent teams the informing obligations and lightweight conventions that decades of multi-agent research already formalized; Chapter 32 develops these orchestration patterns in depth. The cost models of this section apply directly, because every extra coordination message between LLM agents is a real token bill.

We now have the full toolkit for getting agents to act as one: organizational structure and conventions that pay the coordination cost at design time, partial global planning and explicit communication that pay it at run time, joint intentions that make a team robust to changing beliefs, and the game-theoretic check that the whole arrangement is stable. The next section asks what it means for these coordinating agents to know things together, building the notions of distributed knowledge and common belief that any informing obligation quietly assumes. That story begins in Section 27.8.

Exercise 27.7.1: Convention or Negotiation? (Conceptual)

For each scenario, decide whether coordination is better handled by a fixed convention or by run-time negotiation, and justify your choice in terms of the cost $C(n)$ from subsection 3 of this section (The Cost of Coordination) and how predictable the situation is: (a) elevators in a tall building deciding which car answers a call; (b) two warehouse robots that meet in a narrow aisle they have never encountered together before; (c) packets choosing an output port in a network switch; (d) a rescue team of robots dividing an unmapped collapsed building. State explicitly what a convention would cost in flexibility and what a negotiation would cost in communication.

Exercise 27.7.2: Make the Convention Adaptive (Coding)

Modify Code 27.7.1 so the priority order is not fixed but rotates: after an agent crosses, its direction moves to the back of the priority order (a round-robin convention). Measure collisions, messages, and crossings, and compare fairness across the four directions to the fixed-priority run by reporting how many crossings each direction achieved. Does the rotating convention keep collisions and messages at zero? Explain why a convention can be made fair without reintroducing any communication, and what that implies for designing rules that are both cheap and equitable.

Exercise 27.7.3: Where Coordination Stops Paying (Analysis)

Take the cost model $T(n) = T_1/n + C(n)$ from subsection 3 of this section (The Cost of Coordination) and suppose the coordination overhead grows linearly with the group, $C(n) = c \cdot n$ for a per-agent message cost $c$. Find the group size $n^\star$ that minimizes $T(n)$ by treating $n$ as continuous and setting the derivative to zero, and express $n^\star$ in terms of $T_1$ and $c$. Then explain, in the language of Section 3.5, how lowering $c$ (for instance by replacing a negotiation with a convention, as Output 27.7.1 does) shifts $n^\star$ and why this is the multi-agent statement of Amdahl's law.