"They never told me what the messages meant. We worked it out between us, and now nobody else can read our mail."
A Receiver Fluent in a Language of Two
When agents are given a shared goal but no protocol, and a channel but no dictionary, a communication code can emerge on its own: the meaning of each message is not designed by an engineer but discovered by the agents as the thing that makes their joint reward go up. Every collective behavior in this chapter so far, from ant pheromone trails to flocking, used a fixed, hand-specified signal. Emergent communication removes that assumption and asks the harder question: can the agents invent the signal themselves? This bridges the swarm and multi-agent-RL machinery of this part with the oldest unsolved problem of distributed AI, namely how independently built agents come to share a vocabulary at all. The answer is that they can, that the learned code works, and that it is also fragile, opaque, and idiosyncratic unless you apply specific pressures, which is exactly the trade-off you must weigh before letting LLM agents talk to each other in a language you did not write.
For most of this book, communication has meant moving known quantities between machines: a gradient vector in an all-reduce, a key-value pair in a shuffle, a pheromone level on an edge. The sender and receiver agreed in advance on what the bits mean, because a human wrote both ends. Section 29.4 named this assumption directly when it built agent communication languages and shared ontologies: classical distributed AI simply assumed a common vocabulary existed, and spent its effort on the protocol layered on top. Emergent communication attacks the assumption itself. We place agents in a task they can only solve by exchanging information, give them an uninterpreted channel, and let a learning signal decide what the symbols mean. The protocol is not engineered; it precipitates out of the reward.
This matters now for a concrete reason. When you wire several large language model agents together to solve a task, you face a choice you can no longer dodge: do they talk in a protocol you designed and can read, or in one they optimize between themselves and you cannot? This section gives you the conceptual tools and the smallest possible working example to reason about that choice, and it sets up the forward question of whether learned agent protocols beat human-designed ones, which we take up in Section 32.6.
1. The Referential Game (Beginner)
The canonical setting for emergent communication is the referential or signaling game, a two-agent cooperative task with an asymmetry of information. A sender observes a target that the receiver cannot see, for example one image out of a candidate set, or one abstract meaning out of a fixed inventory. The sender emits a message drawn from some channel: a single discrete symbol, or a short sequence of symbols. A receiver sees the message (but not the target directly), sees the candidate set, and must act, typically by pointing at which candidate it thinks the target is. Both agents receive the same reward, one if the receiver is correct and zero otherwise. Crucially, no symbol has a built-in meaning. The mapping from targets to messages, and from messages back to targets, is learned end to end from reward.
Formally, let the target be $t$, drawn from a set of meanings. The sender is a policy $\pi_S(m \mid t)$ over messages $m$, and the receiver is a policy $\pi_R(a \mid m)$ over actions $a$ (its guess at the target). The pair succeeds when $a = t$, so the shared objective is to maximize the expected reward
$$J(\pi_S, \pi_R) = \mathbb{E}_{t}\, \mathbb{E}_{m \sim \pi_S(\cdot \mid t)}\, \mathbb{E}_{a \sim \pi_R(\cdot \mid m)} \big[\, \mathbb{1}[a = t] \,\big].$$

This is exactly a cooperative multi-agent reinforcement learning problem of the kind built in Chapter 30: two policies, a shared reward, no communication except through the channel the task provides. What makes it special is that the channel is the policy. The message is both the sender's action and the receiver's observation, so the agents are co-designing the medium they use to coordinate. Figure 31.7.1 shows the loop.
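For tabular policies the expectation above can be evaluated exactly rather than sampled. The sketch below assumes a uniform prior over targets and as many messages as meanings; the function name `expected_reward` and the example permutation are illustrative choices, not part of the original text. With $S[t, m] = \pi_S(m \mid t)$ and $R[m, a] = \pi_R(a \mid m)$, the objective reduces to the trace of $SR$ divided by the number of targets.

```python
import numpy as np

def expected_reward(S, R):
    """Exact J for tabular policies, assuming a uniform prior over targets.

    S[t, m] = pi_S(m | t) and R[m, a] = pi_R(a | m), so
    J = (1/T) * sum_t sum_m S[t, m] * R[m, t] = trace(S @ R) / T.
    """
    S = np.asarray(S, dtype=float)
    R = np.asarray(R, dtype=float)
    return float(np.trace(S @ R) / S.shape[0])

M = 5
uniform = np.ones((M, M)) / M          # no symbol means anything yet
identity = np.eye(M)                   # meaning i <-> message i
perm_sender = np.eye(M)[[2, 0, 3, 1, 4]]  # an arbitrary one-to-one code (illustrative)
perm_receiver = perm_sender.T          # the receiver that inverts it

j_uniform = expected_reward(uniform, uniform)
j_identity = expected_reward(identity, identity)
j_permuted = expected_reward(perm_sender, perm_receiver)
print(f"{j_uniform:.2f} {j_identity:.2f} {j_permuted:.2f}")  # 0.20 1.00 1.00
```

The permuted pair scores exactly as well as the identity pair, which is one way to see that the objective is indifferent to which one-to-one code the agents settle on.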
2. A Shared Code Emerging from Reward (Intermediate)
To see a language appear from nothing, we strip the game to its core. There are five meanings and five available messages, and both the sender and the receiver start as uniform random tables: every meaning maps to a uniformly random message, and every message is decoded to a uniformly random guess. No symbol means anything. We then play episodes and nudge both tables with a simple REINFORCE rule (the policy-gradient estimator from Chapter 30): when a round succeeds, make the sender slightly more likely to send the message it just sent for that meaning, and make the receiver slightly more likely to decode that message to the meaning it just guessed. Nothing else is supplied. The code below is the entire mechanism.
import numpy as np

rng = np.random.default_rng(7)
M = 5            # number of meanings (targets) and messages: agents must learn a code
EPISODES = 4000
lr = 0.15

# Sender policy: row m = probability distribution over messages for meaning m.
# Receiver policy: row s = probability distribution over guessed meanings for message s.
S = np.ones((M, M)) / M
R = np.ones((M, M)) / M

def sample(p):
    return rng.choice(len(p), p=p / p.sum())

acc_curve = []
hits = 0
for ep in range(1, EPISODES + 1):
    meaning = rng.integers(M)                    # nature shows the target to the sender only
    msg = sample(S[meaning])                     # sender encodes meaning -> learned signal
    guess = sample(R[msg])                       # receiver decodes signal -> action
    reward = 1.0 if guess == meaning else 0.0    # joint reward: receiver picked right target

    # REINFORCE update on both policies (centered baseline 0.5 keeps it stable).
    adv = reward - 0.5
    S[meaning, msg] += lr * adv * (1 - S[meaning, msg])
    S[meaning] = np.clip(S[meaning], 1e-6, None)  # keep probabilities positive
    S[meaning] /= S[meaning].sum()                # renormalize the row
    R[msg, guess] += lr * adv * (1 - R[msg, guess])
    R[msg] = np.clip(R[msg], 1e-6, None)
    R[msg] /= R[msg].sum()

    hits += reward
    if ep % 800 == 0:
        acc_curve.append((ep, hits / 800))
        hits = 0

# Greedy evaluation: does a stable shared code exist?
sender_code = S.argmax(1)        # meaning -> message the sender now uses
receiver_decode = R.argmax(1)    # message -> meaning the receiver now reads
roundtrip = [receiver_decode[sender_code[m]] for m in range(M)]
greedy_acc = np.mean([roundtrip[m] == m for m in range(M)])

print("chance accuracy :", f"{1.0 / M:.2f}")
for ep, a in acc_curve:
    print(f"episode {ep:5d} accuracy :", f"{a:.2f}")
print("learned sender code :", sender_code.tolist(), "(meaning -> message)")
print("receiver decoding :", receiver_decode.tolist(), "(message -> meaning)")
print("greedy round-trip acc :", f"{greedy_acc:.2f}")
chance accuracy : 0.20
episode 800 accuracy : 0.54
episode 1600 accuracy : 0.80
episode 2400 accuracy : 0.79
episode 3200 accuracy : 0.82
episode 4000 accuracy : 0.78
learned sender code : [2, 3, 2, 1, 4] (meaning -> message)
receiver decoding : [4, 3, 0, 1, 4] (message -> meaning)
greedy round-trip acc : 0.80
Two facts in Output 31.7.1 carry the whole section. The first is success: accuracy quadruples off the chance floor, and a glance at the learned tables shows why. Meaning 3 reliably travels as message 1 and is decoded back to meaning 3; meaning 1 travels as message 3 and returns as meaning 1. A vocabulary that no one designed now lets one agent steer another's action. The second fact is the imperfection: meanings 0 and 2 collided on the same message 2, so the receiver, seeing message 2, can only guess one of them (under greedy decoding it reads meaning 0, so meaning 2 is always missed). Message 0 meanwhile goes unused by the sender, so a perfect one-to-one code was available within the same channel. The agents found a code, but a lazy and partial one, and accuracy plateaued near 0.80 from episode 1600 onward instead of repairing the collision. That collision is not a bug in the code; it is the central lesson of the field in miniature.
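The reading of the tables above can be checked mechanically. The following sketch copies the two greedy tables from Output 31.7.1 and reports which messages are shared by several meanings, which messages the sender never uses, and which meanings survive the round trip. The helper name `analyze_code` is an illustrative choice, not something defined in Code 31.7.1.

```python
from collections import defaultdict

# Greedy tables copied from Output 31.7.1.
sender_code = [2, 3, 2, 1, 4]      # meaning -> message
receiver_decode = [4, 3, 0, 1, 4]  # message -> meaning

def analyze_code(sender_code, receiver_decode):
    """Return (collisions, unused messages, meanings recovered by the round trip)."""
    users = defaultdict(list)                      # message -> meanings that send it
    for meaning, msg in enumerate(sender_code):
        users[msg].append(meaning)
    collisions = {msg: ms for msg, ms in users.items() if len(ms) > 1}
    unused = [msg for msg in range(len(receiver_decode)) if msg not in users]
    recovered = [m for m, msg in enumerate(sender_code) if receiver_decode[msg] == m]
    return collisions, unused, recovered

collisions, unused, recovered = analyze_code(sender_code, receiver_decode)
accuracy = len(recovered) / len(sender_code)
print("collisions        :", collisions)   # {2: [0, 2]}
print("unused messages   :", unused)       # [0]
print("recovered meanings:", recovered)    # [0, 1, 3, 4]
print("greedy accuracy   :", accuracy)     # 0.8
```

The same helper can be pointed at the tables from any other run to see whether a plateau below 1.0 comes from a collision of this kind.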
An emergent protocol is whatever sender-receiver mapping the joint learning process settles into, and the learning process only cares about reward, not about clarity, economy, or generalization. So the code that emerges is sufficient for the training task and almost nothing more. It will reuse a symbol for two meanings if the resulting error is cheap, leave structure on the table, and differ completely between two runs with different seeds. The meaning of a message lives entirely in the partner that learned to read it, which is why an emergent language is powerful inside the pair that grew it and unreadable to anyone outside.
3. Why Emergent Languages Are Brittle, and What Fixes Them (Advanced)
The collision in Output 31.7.1 generalizes into the three recurring pathologies of learned communication. First, emergent languages are typically non-compositional: even when the meaning space has obvious structure (color and shape, say), the learned symbols rarely factor into reusable parts, so a code for "red square" tells you nothing about the code for "red circle." Second, they are unstable: small changes in initialization, learning rate, or population produce entirely different vocabularies, and within a run the code can drift. Third, they are idiosyncratic and opaque: the protocol is a private fixed point of one pair, optimized for the task and legible to no human and no third agent. These are not failures of a particular algorithm; they are what happens when the only pressure on a language is "make this one reward go up."
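Non-compositionality can be made measurable. One commonly used score is topographic similarity: the rank correlation between pairwise distances in meaning space and pairwise distances in message space. The sketch below is a minimal NumPy-only version under stated assumptions: meanings are (color, shape) tuples, messages are fixed-length symbol tuples, Hamming distance is used on both sides, and the "holistic" code is a hypothetical reassignment of the same messages chosen for illustration. Library implementations may use other distance functions.

```python
import itertools
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    mean_rank_per_value = np.bincount(inverse, weights=ranks) / counts
    return mean_rank_per_value[inverse]

def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def topographic_similarity(meanings, messages):
    """Spearman correlation between meaning distances and message distances."""
    pairs = list(itertools.combinations(range(len(meanings)), 2))
    d_meaning = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    d_message = [hamming(messages[i], messages[j]) for i, j in pairs]
    return float(np.corrcoef(average_ranks(d_meaning), average_ranks(d_message))[0, 1])

# 3 colors x 3 shapes = 9 structured meanings.
meanings = [(c, s) for c in range(3) for s in range(3)]
# Compositional code: one symbol per color (3..5), one per shape (6..8).
compositional = [(c + 3, s + 6) for c, s in meanings]
# Hypothetical holistic code: the same 9 messages, reassigned by i -> 2*i mod 9.
holistic = [compositional[(2 * i) % 9] for i in range(9)]

topsim_compositional = topographic_similarity(meanings, compositional)
topsim_holistic = topographic_similarity(meanings, holistic)
print(f"compositional: {topsim_compositional:.2f}")
print(f"holistic     : {topsim_holistic:.2f}")
```

Both codes are one-to-one, so both would reach perfect accuracy in a referential game; only the compositional one scores a topographic similarity of 1, which is why accuracy alone cannot detect the first pathology.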
The productive discovery of the last few years is that the pressures which produce human-language-like structure can be added deliberately. A larger population of senders and receivers, with agents randomly paired each round, punishes private idiolects because a message must work against many partners, not one. A transmission bottleneck, in which the language is taught to a fresh generation of agents through a narrow channel of examples (an iterated-learning or cultural-transmission setup), rewards compressible, compositional codes because only structured languages survive being passed on. And grounding the symbols in rich, structured inputs (real images, embodied observations) rather than abstract indices gives compositionality something to latch onto. None of these pressures is present in Code 31.7.1, which is why the language it learns is a lazy lookup table; adding any of them pushes the emergent language toward something more systematic and more shareable.
Change the seed in Code 31.7.1 from 7 to almost any other number and re-run it. The accuracy curve looks much the same, but the learned sender_code comes out completely different: maybe meaning 4 now travels as message 0, maybe a different pair of meanings collides. Two sender-receiver pairs trained on the identical task end up speaking mutually unintelligible languages. This is the emergent-communication version of the classic distributed-systems headache that no two independently initialized replicas agree on anything until you force them to, and it is precisely why the field treats population and transmission pressure as load-bearing, not optional.
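The re-run experiment can be packaged as a function of the seed. The sketch below re-implements the tabular loop of Code 31.7.1 as `train_pair` (a hypothetical helper name), trains two pairs with seeds 7 and 11 (the second seed is an arbitrary choice), and scores every sender against every receiver under greedy decoding. No expected numbers are quoted here, since they depend on the seeds chosen.

```python
import numpy as np

def train_pair(seed, M=5, episodes=4000, lr=0.15):
    """Train one tabular sender/receiver pair; return their greedy codes."""
    rng = np.random.default_rng(seed)
    S = np.ones((M, M)) / M   # sender: meaning -> distribution over messages
    R = np.ones((M, M)) / M   # receiver: message -> distribution over guesses
    for _ in range(episodes):
        meaning = rng.integers(M)
        msg = rng.choice(M, p=S[meaning] / S[meaning].sum())
        guess = rng.choice(M, p=R[msg] / R[msg].sum())
        adv = (1.0 if guess == meaning else 0.0) - 0.5
        # Same update rule for both tables: nudge the entry that was just used.
        for table, row, col in ((S, meaning, msg), (R, msg, guess)):
            table[row, col] += lr * adv * (1 - table[row, col])
            table[row] = np.clip(table[row], 1e-6, None)
            table[row] /= table[row].sum()
    return S.argmax(1), R.argmax(1)

def greedy_accuracy(sender_code, receiver_decode):
    """Fraction of meanings that survive sender -> message -> receiver."""
    return float(np.mean(receiver_decode[sender_code] == np.arange(len(sender_code))))

seeds = (7, 11)
codes = {seed: train_pair(seed) for seed in seeds}
results = {}
for s_seed in seeds:
    for r_seed in seeds:
        sender_code = codes[s_seed][0]
        receiver_decode = codes[r_seed][1]
        results[(s_seed, r_seed)] = greedy_accuracy(sender_code, receiver_decode)
        print(f"sender seed {s_seed:2d} -> receiver seed {r_seed:2d}: "
              f"{results[(s_seed, r_seed)]:.2f}")
```

Comparing the diagonal entries (pairs that trained together) with the off-diagonal ones (mixed pairs) gives a direct, if small, measurement of how private each learned code is.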
Who: A research engineer on a team building a cooperative two-agent system for compressing structured telemetry before it crossed an expensive radio link.
Situation: Rather than hand-design a wire format, the team trained a sender and receiver end to end with a referential objective, letting the encoding emerge so it could be tuned to the exact data distribution.
Problem: In evaluation the pair hit excellent accuracy on the training distribution, but a separately trained "audit" receiver could make no sense of the messages, and when the data distribution shifted slightly, accuracy collapsed.
Dilemma: Keep the learned code, which was compact and task-optimal but opaque and brittle, or fall back to a designed protocol that was bulkier on the wire but interpretable and robust to distribution shift.
Decision: They kept emergence but added pressure: a population of eight senders and receivers randomly paired each step, plus a transmission bottleneck that re-taught the code to fresh agents every few thousand steps.
How: The population forced messages to work across partners instead of memorizing one, and the bottleneck rewarded a compressible, more compositional code that survived being passed on, both implemented on top of the same referential loop as Code 31.7.1.
Result: The new code lost a little peak training accuracy but generalized across the distribution shift and became partly interpretable: distinct symbols now tracked distinct telemetry fields, so a human could finally read the wire.
Lesson: A learned protocol with no social pressure optimizes for one partner and one distribution. Population size and a teaching bottleneck are the cheapest known levers for turning a private code into a shareable language.
4. The Bridge Back to Shared Vocabularies, and Forward to Agents (Intermediate)
Step back and the place of this section in the book becomes clear. Classical distributed AI, in Section 29.4, took a shared agent-communication language and a common ontology as given and built negotiation and coordination on top. Emergent communication is the missing first chapter of that story: it shows where a shared vocabulary could come from when no designer supplies one, learned from joint reward exactly as the swarm behaviors earlier in this chapter were, but with the signal itself as the learnable object. It is the natural meeting point of swarm coordination, cooperative multi-agent RL (Chapter 30), and the language question that distributed AI (Chapter 27) raised and left open.
This book's spine is that intelligence at scale is distributed work plus the communication that recombines it. Until now the communication layer was always designed: all-reduce sums known vectors, the shuffle routes known keys, pheromones carry a fixed scalar. Emergent communication shows the final form of the same idea, a coordination channel whose very semantics are produced by the distributed learning process rather than imposed on it. The same reward that splits and recombines the agents' work also mints the words they use to do it. When the agents are LLMs, this stops being a curiosity: the question of whether their coordination channel is designed or learned becomes a live system-design decision, taken up next in the orchestration chapter.
The forward link is the one that matters for practice. When you connect several LLM agents, you are choosing, often without noticing, between a designed protocol you can read and audit and a learned one the agents optimize between themselves. The trade-off is the same one Output 31.7.1 made concrete: a learned channel can be tuned tightly to the task but is opaque and brittle, while a designed channel is interpretable and robust but rigid and rarely optimal. Section 32.6 turns this into a concrete design question for agent protocols, and the research frontier below is where the two worlds are starting to collide.
The field has pivoted hard toward language models. One line studies whether multi-agent LLM systems develop their own conventions and abbreviations on top of natural language, and whether those drift away from human-readable text under optimization pressure, an opacity risk that overlaps the interpretability and oversight agenda. A second line uses the classic referential-game and iterated-learning machinery to probe and induce compositionality in or through LLMs, asking whether the compositional structure of human language can be recovered as a learning-and-transmission pressure rather than assumed (work in the lineage of Lazaridou and Baroni's emergent-communication survey, extended with transformer agents and cultural-transmission bottlenecks). A third asks the blunt deployment question this section opened with: do learned inter-agent protocols beat human-designed ones for multi-agent LLM coordination, and at what cost in auditability? The EGG toolkit remains the standard sandbox for the referential and compositionality experiments these papers build on. The open and somewhat uncomfortable finding is that effectiveness and interpretability are in genuine tension, so an agent system that is allowed to optimize its own language may become both more capable and less governable at once.
Code 31.7.1 hand-rolled two tables and a REINFORCE loop to expose the mechanism. For real experiments with neural senders and receivers, discrete-message channels (Gumbel-Softmax or REINFORCE), population training, and built-in compositionality metrics (topographic similarity, positional disentanglement), the standard tool is EGG (Emergence of lanGuage in Games). It collapses the dozens of lines of training-loop and metric plumbing into a few configured components:
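# EGG (Emergence of lanGuage in Games, Facebook Research) is distributed from
# its GitHub repository; check the README for the current install command, e.g.
#   pip install git+https://github.com/facebookresearch/EGG.git
import egg.core as core

# MySenderNet, MyReceiverNet, referential_loss, optimizer, train_data, and
# validation_data are placeholders for your own task-specific components.
# EGG expects core.init(...) to have been called before a Trainer is built.

# Wrap your task-specific sender/receiver nets in EGG's game wrappers; EGG
# supplies the discrete channel, the optimization, and the logging.
sender = core.RnnSenderReinforce(MySenderNet(), vocab_size=10, embed_dim=32,
                                 hidden_size=64, cell="gru", max_len=2)
receiver = core.RnnReceiverDeterministic(MyReceiverNet(), vocab_size=10,
                                         embed_dim=32, hidden_size=64, cell="gru")
game = core.SenderReceiverRnnReinforce(sender, receiver, loss=referential_loss,
                                       sender_entropy_coeff=0.1)
# Pass validation_data by keyword so it is not taken for another positional argument.
trainer = core.Trainer(game, optimizer, train_data,
                       validation_data=validation_data,
                       callbacks=[core.TopographicSimilarity()])  # measures compositionality
trainer.train(n_epochs=50)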
In Output 31.7.1 the sender maps meaning 3 to message 1, and the receiver maps message 1 back to meaning 3. Suppose you froze the sender and replaced its receiver with a fresh, separately trained receiver that reaches the same accuracy with its own sender. Predict what the round-trip accuracy of the mixed pair would be, and explain why, referring to the Key Insight. Then state, in one sentence, what this implies for the claim that an emergent protocol can be reused across independently trained agent teams.
Extend Code 31.7.1 to a population: keep $P = 6$ sender tables and $6$ receiver tables, and each episode sample one sender and one receiver uniformly at random to play and update. Report the greedy round-trip accuracy averaged over all $36$ sender-receiver pairings, not just the pair that happened to train together. Compare against the single-pair baseline and explain, with reference to Section 3, why random pairing reduces same-message collisions and pushes the population toward a code that more pairs can read.
Consider the five-meaning game with a five-symbol channel. Write down the obvious hand-designed protocol (meaning $i$ maps to symbol $i$, decoded straight back) and state its accuracy. Now argue from the structure of Code 31.7.1 why the learned code in Output 31.7.1 fell short of that perfect designed protocol despite having the same channel capacity, and identify the single property of the learning dynamics responsible. Finally, give one realistic task setting where you would nonetheless prefer the learned protocol over the perfect designed one, and connect your answer to the trade-off that Section 32.6 will formalize.