Section 39.3: Distributed Task Allocation

"A task was auctioned to the agent who wanted it least reluctantly. I bid my distance, lost to a closer rotor, and discovered I had been free all along."
An Agent That Bid Too High

Big Picture

A team of robots facing a list of jobs has to answer one question over and over: who does what? When there is no central scheduler to answer it, the agents must answer it among themselves, bidding on tasks by local cost and resolving conflicts by agreement, so that the assignment they reach with only neighborly communication is close to the one an omniscient planner would have chosen. The previous section gave the swarm a way to move together; coordination of motion is worthless if two drones fly to the same target and none covers the third. This section turns the matching of agents to tasks into a distributed computation in its own right. We state the multi-robot task-allocation problem and its taxonomy, fix the centralized optimum as a yardstick with the Hungarian algorithm, then build the decentralized market and consensus mechanisms that approximate that optimum without ever assembling the full picture on one machine.

In Section 39.2 the robots learned to hold formation and avoid collisions, treating coordination as a problem of motion. But a swarm is deployed to accomplish work: survey these twelve regions, deliver to those eight addresses, inspect each of these turbines. The work arrives as a set of tasks, and the team's effectiveness is decided long before any motor turns, at the moment each task is matched to an agent. That matching is the subject here. On a single machine it is a textbook assignment problem with a clean optimal solution. The difficulty, and the reason it belongs in this book, is that the team has no single machine: each robot knows its own position and capabilities, sees only its neighbors, and must arrive at a coherent team-wide assignment through local negotiation. The allocation is itself distributed, and the central thread of this section is that local bidding plus agreement can approximate a global optimum that no agent could compute alone.

1. The Multi-Robot Task-Allocation Problem Beginner

Multi-robot task allocation (MRTA) asks how to assign a set of tasks to a team of robots so that some team objective, usually total cost or total reward, is optimized, subject to each robot's capabilities and the structure of the tasks. The problem is so pervasive that it has a standard taxonomy, due to Gerkey and Mataric, along three binary axes. The first axis is single-task versus multi-task robots: can a robot work on one task at a time, or several at once? The second is single-robot versus multi-robot tasks: does a task need exactly one robot, or a coalition? The third is instantaneous versus time-extended assignment: do we assign the tasks available right now, or plan a schedule of future tasks per robot? The simplest and most studied corner, single-task robots, single-robot tasks, instantaneous assignment, is exactly the assignment problem, and it is where we anchor the theory before relaxing toward the realistic time-extended case.

Each axis multiplies the difficulty. Instantaneous single-assignment is solvable optimally in polynomial time, as the next subsection shows. The time-extended case, where each robot is given an ordered bundle of tasks to perform in sequence, is NP-hard, because the cost of adding a task to a robot depends on the other tasks already in its bundle and on their order. This is the same combinatorial explosion that makes cluster job-packing hard in Chapter 33, and it is why the realistic algorithms later in this section are greedy heuristics with provable guarantees rather than exact solvers. Table 39.3.1 lays out the taxonomy and where each cell is treated.

Table 39.3.1: The Gerkey-Mataric taxonomy of multi-robot task allocation. Each axis is binary; the corners range from a polynomial-time assignment problem to an NP-hard scheduling problem.

Axis	Easy end	Hard end	Treated in
Robot capacity	Single-task (one job at a time)	Multi-task (several at once)	Section 2 (single)
Task demand	Single-robot (one agent suffices)	Multi-robot (needs a coalition)	Section 5 (coalitions)
Time horizon	Instantaneous (assign now)	Time-extended (schedule bundles)	Section 3 (CBBA bundles)

2. The Centralized Optimum as a Yardstick Intermediate

To judge any decentralized scheme we need the best possible answer to compare against, and for the simplest corner of the taxonomy that answer is computable exactly. Suppose there are $n$ agents and $n$ tasks, and let $c_{ij}$ be the cost for agent $i$ to perform task $j$, for instance the distance it must travel. An assignment is a permutation that pairs each agent with one task, encoded by binary variables $x_{ij} \in \{0, 1\}$ where $x_{ij} = 1$ means agent $i$ is assigned task $j$. The linear assignment problem minimizes total cost subject to each agent and each task being used exactly once,

$$\min_{x} \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}\, x_{ij}, \qquad \sum_{j} x_{ij} = 1 \;\; \forall i, \qquad \sum_{i} x_{ij} = 1 \;\; \forall j, \qquad x_{ij} \in \{0,1\}.$$

Although the variables are binary, this problem has the remarkable property that its linear relaxation has integer optimal vertices, so it is solvable in polynomial time. The Hungarian algorithm of Kuhn and Munkres solves it in $O(n^3)$ time by manipulating the cost matrix until a zero-cost perfect matching appears in the reduced costs. It is the centralized optimum: a single process that holds the entire matrix $C = [c_{ij}]$ can compute the cost-minimizing assignment outright. The catch is precisely that it needs the entire matrix in one place. In our swarm, row $i$ lives on robot $i$ and no robot holds the whole matrix, so the Hungarian solution is a yardstick we measure against, not a procedure the team can run. It plays the same role here that the single-machine gradient played in Section 1.1: the exact centralized answer that the distributed method must approximate or match.

Key Insight: The Same Problem, With and Without a Center

Distributed task allocation and centralized cluster scheduling are the same mathematical object viewed from opposite sides. In Chapter 33 a single scheduler holds the full picture of jobs and machines and places work to optimize a global objective; the hard part is the combinatorics, not the information. Here every robot holds one row of the cost matrix and sees only its neighbors; the combinatorics is the same, but the binding constraint is that no agent can see the whole problem. Decentralization does not change what is optimal; it changes who is allowed to know it, and that is what forces the shift from solving to negotiating.

3. Decentralized Allocation by Auction Intermediate

Markets allocate scarce resources without a central planner, and a robot team can borrow the mechanism directly. In an auction-based allocation, each task is an item up for bid and each robot bids on the tasks it can do, using its own private cost. The bid that a self-interested agent should place is the marginal value of winning the task, the improvement to its own objective from adding that task to its current load. For a robot at position $p_i$ bidding on task $j$ at location $q_j$, the natural bid in the instantaneous single-task case is simply the negative travel cost, so that a closer robot bids higher,

$$b_{ij} = -\, c_{ij} = -\, \lVert p_i - q_j \rVert.$$

The auction runs in rounds. Every unassigned robot announces a bid on its currently most attractive free task; for each task, the highest bid wins and that robot-task pair is fixed; the losers return to the next round and bid again on what remains. This is the auction algorithm of Bertsekas, and it is genuinely distributed: a bid is a local computation on private cost, and the only thing that crosses the network is the small message "I bid $b$ on task $j$." It connects directly to the negotiation and auction protocols of Chapter 29, where the Contract Net protocol casts task allocation as announce, bid, and award between agents. Figure 39.3.1 traces one round: agents broadcast bids, the conflict over a contested task is resolved in favor of the lowest cost, and a conflict-free assignment emerges.

Figure 39.3.1: One auction round in decentralized task allocation. Agents broadcast bids equal to their negative cost (dashed orange); two agents contest task T1, and consensus resolves the conflict in favor of the lower-cost bidder (Agent 1). The outbid agent re-bids on a remaining task in the next round, and the process terminates in a conflict-free assignment (solid blue). No agent ever holds the full cost matrix.

For the time-extended corner of the taxonomy, where each robot is awarded an ordered bundle of tasks rather than a single one, the standard decentralized method is the consensus-based bundle algorithm (CBBA) of Choi, Brunet, and How. CBBA interleaves two phases that repeat until the team agrees. In the bundle-building phase, each robot greedily adds the task with the highest marginal score to its bundle, where the marginal score of inserting task $j$ accounts for where it best fits in the robot's existing route. In the consensus phase, robots exchange their winning-bid vectors with neighbors and apply deterministic conflict-resolution rules, so that whenever two robots claim the same task, the one with the higher bid keeps it and the loser releases it and any tasks it had added afterward. The marginal-score bid that robot $i$ places when inserting task $j$ at the best position in its current bundle $\mathcal{B}_i$ is

$$b_{ij} = \max_{k \le |\mathcal{B}_i|} \Big[ S_i\big(\mathcal{B}_i \oplus_k \{j\}\big) - S_i\big(\mathcal{B}_i\big) \Big],$$

where $S_i$ is robot $i$'s reward for completing its bundle and $\oplus_k$ inserts $j$ at position $k$ in the route. CBBA provably converges to a conflict-free assignment within a bounded number of communication rounds on a connected network, and the result is within a constant factor of the optimal team reward when the scoring function is submodular, meaning each added task helps less the more a robot already has. The consensus phase is doing exactly the conflict-resolution work that Chapter 2 formalizes: agents must agree on a single consistent value (who owns each task) despite each starting from its own local view.

Thesis Thread: Consensus, Returning as Conflict-Free Assignment

The agreement primitive introduced in Chapter 2, getting many machines to commit to one consistent value despite partial views, is the engine of CBBA. There the value was a committed log entry; here it is the owner of each task. The same scale-out pattern recurs throughout the book: a global property (a conflict-free assignment, an averaged gradient, a committed transaction) is reconstructed from purely local exchanges between neighbors. When you meet a new distributed-AI method, ask what global invariant it is trying to agree on and which consensus rule enforces it; in CBBA the invariant is "every task has exactly one owner" and the rule is "highest bid wins."

4. Response Thresholds: Allocation Without Bidding Intermediate

Auctions assume robots can exchange bids and run rounds of negotiation. A larger or more communication-starved swarm may not afford even that, and biology suggests a lighter mechanism. Social insects allocate labor with no negotiation at all, through response thresholds: each individual has a private threshold for each kind of task, and it engages a task only when the local stimulus for that task exceeds its threshold. As idle agents take on work, the stimulus falls; as work piles up, the stimulus rises until even high-threshold agents respond. This produces a stable division of labor with no messages exchanged, a mechanism developed in full in Chapter 31. The probability that an agent with threshold $\theta$ engages a task whose stimulus is $s$ follows the response-threshold function

$$P(\text{engage} \mid s, \theta) = \frac{s^2}{s^2 + \theta^2},$$

so an agent almost ignores tasks whose stimulus is well below its threshold and almost certainly engages those well above it. Heterogeneous thresholds across the team yield specialization for free: low-threshold agents become the first responders for a task type, high-threshold agents the reserves that activate only under heavy demand. The trade-off against auctions is sharp. Threshold allocation needs no communication and degrades gracefully as the swarm grows, but it cannot guarantee the near-optimal global cost that bidding achieves, because no agent ever compares its cost against another's. The right mechanism depends on the regime: auctions when bandwidth permits and optimality matters, thresholds when the swarm is too large or too silent to bid.

Practical Example: The Warehouse Robots That Stopped Crowding the Same Aisle

Who: A robotics platform engineer running a fleet of two hundred autonomous mobile robots in a fulfillment center.

Situation: Pick tasks streamed in continuously; the central dispatcher assigned each new task to the nearest free robot the instant it appeared.

Problem: Bursts of nearby orders sent a dozen robots converging on the same aisle while distant zones went unserved, and the central dispatcher became a throughput bottleneck and a single point of failure.

Dilemma: Keep the central dispatcher and scale it up to a bigger machine, or move to decentralized allocation where robots claim tasks themselves, cheaper and more robust but harder to reason about and to keep conflict-free.

Decision: They moved to a market-based scheme: each robot bid on incoming tasks by expected travel and queue cost, and a lightweight consensus over local clusters resolved contested picks.

How: Tasks were broadcast to robots within a zone; each bid its marginal cost; CBBA-style conflict resolution awarded each task to the highest bidder and released downstream claims when a robot lost.

Result: Aisle crowding fell sharply because a robot that lost a contested pick immediately re-bid elsewhere, throughput rose, and removing one robot no longer stalled the team, since allocation no longer flowed through a single dispatcher.

Lesson: When the dispatcher is the bottleneck and the failure point, the fix is not a bigger dispatcher but no dispatcher: let the agents allocate among themselves and pay only the cost of local bidding.

5. Dynamic Reallocation Under Failure and Arrival Advanced

A static assignment computed once is a fiction in the field. Robots fail, batteries deplete, new tasks appear, and obstacles render a planned task unreachable. Distributed allocation earns its complexity precisely because it reallocates gracefully under these disturbances. When a robot drops out, its tasks become unclaimed; in the auction and CBBA formulations this is handled by the same machinery that built the assignment, because a released task simply re-enters the market and the remaining robots bid on it in the next round. When a new task arrives, it is announced to the team and folded into the next bundle-building phase. The decentralized structure that made allocation possible also makes it self-healing: there is no central assignment table to repair, only a fresh round of local bidding and consensus. This is the multi-robot counterpart of the elastic, fault-tolerant training of Chapter 18, where a worker leaving or joining triggers a re-formation of the group rather than a crash.

The multi-robot-task corner of the taxonomy, where a task needs a coalition rather than a single robot, layers coalition formation on top of bidding: agents bid jointly, and the consensus phase must agree not only on who owns a task but on which group covers it. Even here the principle holds: the team approximates a global optimum it cannot compute centrally by exchanging local bids and resolving conflicts by agreement. The distributed RL infrastructure of Chapter 20 reappears in this case study's training pipeline when the bidding policies themselves are learned rather than hand-designed, closing the loop between the swarm's behavior and the methods of Part IV.

The demonstration below makes the central comparison concrete. It builds a small instance of six agents and six tasks with travel-distance costs, computes the centralized Hungarian optimum as the yardstick, then runs the decentralized auction in which each free agent bids on its cheapest free task and conflicts are resolved by lowest cost, and finally reports the optimality gap. The auction never assembles the full cost matrix in one decision; each agent acts on its own row alone.

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(7)
n = 6  # agents and tasks
agents = rng.uniform(0, 100, size=(n, 2))
tasks = rng.uniform(0, 100, size=(n, 2))

# Cost matrix: c[i, j] = euclidean distance from agent i to task j.
C = np.linalg.norm(agents[:, None, :] - tasks[None, :, :], axis=2)

# Centralized optimum: Hungarian algorithm minimizes total assigned cost.
row, col = linear_sum_assignment(C)
opt_cost = C[row, col].sum()

# Decentralized auction: each unassigned agent bids on its cheapest free task.
# Rounds run until every agent holds one task. No central scheduler picks the
# global optimum; agents act on local cost only, resolving conflicts by lowest bid.
free_agents = list(range(n))
free_tasks = set(range(n))
assign = {}
rounds = 0
while free_agents:
    rounds += 1
    bids = {}  # task -> (cost, agent), keep the lowest-cost claimant
    for a in free_agents:
        j = min(free_tasks, key=lambda t: C[a, t])
        if j not in bids or C[a, j] < bids[j][0]:
            bids[j] = (C[a, j], a)
    for j, (cost, a) in bids.items():
        assign[a] = j
        free_tasks.discard(j)
        free_agents.remove(a)

auction_cost = sum(C[a, j] for a, j in assign.items())
gap = (auction_cost - opt_cost) / opt_cost * 100

print(f"agents/tasks         : {n}")
print(f"hungarian optimum    : {opt_cost:.2f}")
print(f"auction total cost   : {auction_cost:.2f}")
print(f"consensus rounds     : {rounds}")
print(f"optimality gap       : {gap:.2f}%")
print(f"conflict-free        : {len(set(assign.values())) == n}")

Code 39.3.1: A decentralized distance-bid auction compared against the centralized Hungarian optimum. The Hungarian call sees the whole matrix; the auction loop lets each agent bid only on its own cheapest free task, and the lowest bid wins each contested task, exactly the conflict resolution of Figure 39.3.1.

agents/tasks         : 6
hungarian optimum    : 169.46
auction total cost   : 193.91
consensus rounds     : 2
optimality gap       : 14.43%
conflict-free        : True

Output 39.3.1: The decentralized auction reaches a conflict-free assignment in two rounds at a total cost 14.4 percent above the centralized optimum, having never gathered the full cost matrix on any one agent. The gap is the price of decentralization; the robustness and absence of a central scheduler are what that price buys.

Library Shortcut: scipy Solves the Centralized Optimum in One Call

The yardstick in Code 39.3.1 needs no hand-rolled Hungarian implementation. SciPy ships the cubic-time solver as a single call, so the centralized baseline that decentralized schemes are measured against is one line:

from scipy.optimize import linear_sum_assignment
row, col = linear_sum_assignment(C)        # optimal agent->task matching
optimal_total = C[row, col].sum()          # the cost no decentralized scheme can beat

Code 39.3.2: The Hungarian optimum in two lines. A from-scratch Kuhn-Munkres routine runs to dozens of lines and is easy to get wrong on the reduced-cost bookkeeping; linear_sum_assignment handles rectangular matrices and infeasible entries internally and returns the optimal matching directly.

6. When to Decentralize the Allocation Advanced

The optimality gap in Output 39.3.1 is not a defect to be apologized for; it is a deliberate trade. A central scheduler that gathers every cost and runs the Hungarian algorithm achieves the optimum, but it requires every robot to report to one place, makes that place a bottleneck and a single point of failure, and stops working the moment the network partitions. Decentralized allocation accepts a bounded loss of optimality in exchange for needing only neighbor-to-neighbor communication, surviving the loss of any agent, and scaling to teams far larger than a central solver could coordinate in real time. The decision mirrors the one in Chapter 33: a datacenter with reliable links and a trustworthy controller schedules centrally and reaps the optimum, while a field swarm with intermittent links and expendable members allocates locally and accepts the gap. The same problem, two answers, chosen by which constraint binds.

Research Frontier: Learned and Robust Decentralized Allocation (2024 to 2026)

Two threads are reshaping decentralized task allocation. The first replaces hand-designed bids with learned ones: graph neural networks trained over the team's communication graph output per-agent task scores that generalize across team sizes, and recent work casts multi-robot allocation as a decentralized policy learned by multi-agent reinforcement learning, building on the MARL methods of Chapter 30 so that the bid itself adapts to the task distribution. The second hardens the consensus phase against unreliable and adversarial conditions: variants of CBBA tolerate asynchronous and lossy communication, time-varying networks, and agents that bid dishonestly, connecting to the Byzantine-robust aggregation of Chapter 35. The open question both threads circle is the same one this section opened with: how close to the centralized optimum can a team get using only local information, and how gracefully does that bound degrade as the network frays?

Fun Note: The Auction Where Bidding High Means Wanting It Least

In a cost-minimizing robot auction the bid is the negative of the cost, so the agent that "wins" a task is the one for which the task is cheapest, the one with the least to lose by taking it. A drone that bids aggressively on a distant target is, in effect, volunteering for the worst commute on the team. The well-behaved swarm is the one where every agent ends up doing the thing it complained about least, which is a tidier outcome than most human committees manage.

We now have the matching of agents to work as a distributed computation: a taxonomy that locates the problem, a centralized optimum that bounds the best case, and market, consensus, and threshold mechanisms that approximate it with only local information while reallocating gracefully as the team changes. The team can coordinate motion and divide labor; both rest on agents exchanging headings and bids, and what remains is to confront the range-limited, lossy radio that those messages actually cross. That is the subject of Section 39.4.

Exercise 39.3.1: Place the System in the Taxonomy Conceptual

For each scenario, locate it in the three axes of Table 39.3.1 and state whether the instantaneous assignment problem of Section 2 applies or the NP-hard time-extended case does: (a) ten delivery drones, each carrying one package, dispatched to ten addresses right now; (b) five inspection robots, each able to visit a sequence of turbines over a shift, minimizing total route time; (c) a heavy crate that three robots must lift together while others continue picking. For each, name which mechanism from Sections 3 and 4 you would reach for first and why a single-task auction would or would not suffice.

Exercise 39.3.2: Measure the Gap as the Team Grows Coding

Extend Code 39.3.1 to sweep the team size $n$ over $\{4, 8, 16, 32, 64\}$, and for each size average the auction-versus-Hungarian optimality gap over many random instances. Plot the mean gap against $n$. Does the decentralized auction's gap grow, shrink, or stabilize as the team scales? Then modify the auction so each agent considers only tasks within a fixed communication radius rather than all free tasks, and report how restricting visibility changes both the gap and the number of rounds. Relate your finding to why a field swarm tolerates the gap that Output 39.3.1 reports.

Exercise 39.3.3: Cost the Central Bottleneck Analysis

A central scheduler must receive one cost row from each of $n$ robots, run the $O(n^3)$ Hungarian algorithm, and broadcast the assignment back. A decentralized auction instead runs $O(n)$ rounds, each exchanging $O(n)$ short bid messages among neighbors. Using the communication-cost model of Chapter 3, write the message and latency cost of each scheme as a function of $n$ and the per-message latency, and identify the team size beyond which the central scheme's gather-and-broadcast latency exceeds the auction's, assuming the central link is shared while neighbor links are parallel. Argue from your expressions, not intuition, why the optimality gap of the auction can still be the better deal at scale.