"I computed my optimal action assuming the other agents were furniture. They were not furniture. They had computed their optimal actions assuming I was furniture. We are now both worse off, and slightly embarrassed."
An Agent That Forgot the Others Were Also Optimizing
The moment agents have goals of their own, coordination becomes strategic: each agent's best action depends on what the others will do, and game theory is the mathematics of that mutual dependence, the toolkit that tells you what self-interested agents will actually do and when cooperation is a stable outcome rather than a hopeful assumption. Chapter 27 built the protocols by which agents act together and mostly took for granted that they want to. But an autonomous agent is autonomous precisely because it can have a goal that conflicts with its neighbors', and once that is on the table the clean question "how do these agents coordinate?" splits into a harder one: given that every agent is optimizing its own payoff while anticipating that everyone else is doing the same, what outcomes are stable, which of them are good for the group, and can the designer change the rules so the stable outcome is also the desirable one? This chapter answers that question with the standard apparatus of game theory, developed here for the multi-agent AI setting rather than for economics. It shows how to model an interaction as a game, what equilibrium means and why one always exists in the cases we care about, how value gets split when agents form coalitions, how to measure whether a self-interested outcome wastes welfare, how to design mechanisms and auctions whose rules make truth-telling optimal, and how repeated play and learning dynamics let cooperation emerge from agents that were never told to cooperate. This is the analytical core of Part VI: where Chapter 27 gave the protocols of coordination, this chapter gives the theory of whether agents will choose to coordinate at all, and every later chapter in the part formalizes and extends what is built here.
Chapter Overview
This is a focused foundation chapter, and its focus is the single idea that distinguishes a multi-agent system from a distributed one: the other agents are not part of the environment, they are decision-makers whose choices respond to yours. The seven sections develop the consequences of that idea in order. They begin by motivating why an agent with its own goals cannot reason as if it were alone, then build the formal language for writing down strategic interactions, then define what a stable outcome is and prove that one exists, then turn to the cases where agents can do better together, then ask whether the outcomes they reach are good for the group, then show how a designer can shape the rules to align private incentives with collective goals, and finally let the same game be played repeatedly so that learning and reputation can change everything.
The seven sections fall into three movements. The first lays the groundwork: Section 28.1 motivates why agents need game theory, and Section 28.2 builds the two canonical representations, normal form and extensive form. The second develops the central solution concepts: Section 28.3 defines Nash equilibrium and its refinements, Section 28.4 turns to cooperative games and how coalitions split their value, and Section 28.5 asks how the resulting outcomes measure up against social welfare and Pareto optimality. The third movement is about shaping and learning: Section 28.6 develops mechanism design and auctions, the engineering of games whose rules make desirable behavior self-interested, and Section 28.7 closes with repeated games and the learning dynamics through which cooperation can emerge over time.
Read in order, the seven sections take you from "an agent should just optimize its own payoff" to a working understanding of why that is impossible to do in isolation and what to do instead: model the interaction as a game, locate its equilibria, recognize when binding agreements let coalitions capture more value, measure the gap between selfish and socially optimal outcomes, redesign the rules so that gap closes, and understand how repetition and learning let agents reach cooperative outcomes that no single-shot analysis would predict. The argument is cumulative and it is the analytical spine of the rest of Part VI: the coordination protocols of Chapter 27 become situations to be analyzed strategically, and the equilibria, mechanisms, and learning dynamics built here become the foundations that the multi-agent systems of Chapter 29 and the multi-agent reinforcement learning of Chapter 30 formalize and put to work.
Prerequisites
This chapter assumes the distributed-AI foundations laid in the chapter just before it, and a little mathematical comfort beyond that. From Chapter 27: Distributed Artificial Intelligence you carry the central reframing this chapter builds on: that the unit of distribution has become the autonomous agent, an entity with its own information, its own goals, and the authority to act, and that getting such agents to act together is the core problem of the field. Chapter 27 mostly assumed those agents cooperate; here they may not, so the coordination protocols you met there become situations to be analyzed strategically rather than mechanisms to be applied. Beyond that, the chapter assumes basic probability: you should be comfortable with a probability distribution over a finite set of outcomes, an expectation, and the idea of a randomized choice, because mixed strategies, expected payoffs, and the existence of equilibrium all rest on them. Light formal notation and the ability to read pseudocode are assumed throughout, as in the rest of the book. No prior exposure to game theory, economics, or auction design is required; Section 28.1 motivates the whole apparatus from a multi-agent example, and every concept is developed from first principles before it is used. Readers who want to refresh the probability and optimization background can consult the math appendix in Appendix A: Math Background, which collects the distributions, expectations, and convex-optimization facts the formal arguments lean on.
Learning Objectives
- Explain why an agent with its own goals cannot reason as if the other agents were part of a fixed environment, and why their mutual best responses make coordination a strategic problem.
- Model a strategic interaction in both normal form and extensive form, and translate between a payoff matrix and a game tree with information sets.
- Define a Nash equilibrium in pure and mixed strategies, argue why one is guaranteed to exist in finite games, and apply refinements such as dominant strategies and subgame perfection.
- Analyze cooperative games with transferable utility, and use solution concepts such as the core and the Shapley value to divide a coalition's value fairly and stably.
- Measure an outcome against social welfare and Pareto optimality, and quantify the gap between selfish and socially optimal behavior using the price of anarchy.
- Design mechanisms and auctions whose rules make truthful behavior a dominant strategy, and reason about the Vickrey auction and the broader VCG family.
- Reason about repeated games and learning dynamics, and explain how strategies such as tit-for-tat and no-regret learning let cooperation emerge among self-interested agents over time.
If you keep one thing from this chapter, keep this: once agents have their own goals, each agent's best action depends on what the others choose, and game theory supplies the tools, games, equilibria, coalitions, welfare measures, mechanisms, and learning dynamics, that tell you what self-interested agents will actually do and when, and how, cooperation can be made a stable outcome rather than a hopeful assumption. Read forward, the sections build that toolkit in the order a multi-agent designer needs it: first why strategy is unavoidable and how to write a game down, then what a stable outcome is and whether it is any good, then how to redesign the rules so the stable outcome is also the desirable one, and finally how repetition and learning let cooperation grow on its own. Read as a question, the chapter asks of any system of more than one self-interested agent: why can no agent ignore the others, how do we represent the interaction, which strategy profiles are stable, can agents do better by binding together, is the outcome they reach good for the group, can the designer reshape the incentives, and what changes when the game is played again and again. The roadmap below walks the seven sections that answer it, and the last one carries the thread straight into the learning agents of the chapters ahead.
Chapter Roadmap
- 28.1 Why Agents Need Game Theory Motivates the strategic turn by showing that an agent with its own goals must reason about peers who are reasoning about it, so that optimizing in isolation is no longer well defined.
- 28.2 Normal-Form and Extensive-Form Games Builds the two canonical representations of a strategic interaction, the simultaneous-move payoff matrix and the sequential game tree with information sets, and shows how to translate between them.
- 28.3 Nash Equilibria and Solution Concepts Defines the Nash equilibrium in pure and mixed strategies, argues why one always exists in finite games, and develops refinements such as dominance and subgame perfection.
- 28.4 Cooperative Games and Coalitions Turns to games where agents can form binding coalitions, and uses the core and the Shapley value to divide the value a coalition creates stably and fairly.
- 28.5 Social Welfare and Pareto Optimality Asks whether the outcomes agents reach are good for the group, developing Pareto optimality, social welfare, and the price of anarchy that measures the cost of selfishness.
- 28.6 Mechanism Design and Auctions Inverts the analysis to engineer the rules of a game so that truthful, desirable behavior becomes each agent's dominant strategy, developing the Vickrey auction and the VCG family.
- 28.7 Repeated Games and Learning Dynamics Closes the chapter by playing the same game repeatedly, showing how strategies such as tit-for-tat and no-regret learning let cooperation emerge among agents that were never told to cooperate.
Read the seven sections in order and you will hold a working model of strategic interaction among self-interested agents: Section 28.1 motivates why agents need game theory, Section 28.2 builds the normal-form and extensive-form representations, Section 28.3 defines Nash equilibria and solution concepts, Section 28.4 develops cooperative games and coalitions, Section 28.5 measures outcomes against social welfare and Pareto optimality, Section 28.6 designs mechanisms and auctions, and Section 28.7 closes with repeated games and learning dynamics. The thread to watch is the shift from analysis to design and then to learning: first we model and solve a fixed game, then we reshape the rules so the equilibrium is the outcome we want, and finally we let agents play repeatedly and learn their way toward cooperation. That thread runs straight out of the chapter: the strategic foundations built here become the analytical backbone of the multi-agent systems of Chapter 29 and the learning agents of Chapter 30.
What's Next?
This chapter built the strategic toolkit for reasoning about agents with goals of their own: how to write an interaction down as a game, what makes an outcome stable, how coalitions split their value, how to measure the cost of selfishness, how to design rules that align incentives, and how repetition and learning let cooperation emerge. What it has deliberately not done is build the systems. Game theory tells you what rational agents should do in an interaction; it does not tell you how to architect a population of software agents that perceive, communicate, negotiate, and act in a shared environment over time. Chapter 29: Multi-Agent Systems supplies that. It develops agent architectures, communication languages, negotiation and coordination protocols, and organizational structures, turning the equilibria and mechanisms of this chapter into running systems of interacting agents. Where this chapter gave you the analysis of strategic interaction, the next gives you the engineering of agents that carry it out, and the chapter after that, on multi-agent reinforcement learning, lets those agents learn their strategies rather than be handed them. Read Chapter 29 next, and watch the abstract game become a concrete society of agents.
Bibliography & Further Reading
Foundations of Game Theory
von Neumann, J., Morgenstern, O. "Theory of Games and Economic Behavior." Princeton University Press, 1944. press.princeton.edu
The founding work that established game theory as a discipline and introduced the normal-form representation and expected-utility theory underlying Sections 28.2 and 28.3.
Nash, J. F. "Equilibrium Points in N-Person Games." Proceedings of the National Academy of Sciences, 36(1), 1950. pnas.org
The two-page paper that proved every finite game has an equilibrium in mixed strategies, the existence result at the heart of Section 28.3.
Daskalakis, C., Goldberg, P. W., Papadimitriou, C. H. "The Complexity of Computing a Nash Equilibrium." SIAM Journal on Computing, 39(1), 2009. dl.acm.org
The result showing that computing a Nash equilibrium is PPAD-complete, the complexity caveat behind the existence guarantee discussed in Section 28.3.
Cooperative Games and Mechanism Design
Shapley, L. S. "A Value for n-Person Games." In Contributions to the Theory of Games II, Annals of Mathematics Studies 28, Princeton University Press, 1953. degruyter.com
The paper that introduced the Shapley value, the axiomatic fair division of a coalition's worth that anchors Section 28.4.
Vickrey, W. "Counterspeculation, Auctions, and Competitive Sealed Tenders." The Journal of Finance, 16(1), 1961. jstor.org
The work that introduced the second-price sealed-bid auction in which truthful bidding is dominant, the foundation of the mechanism-design material in Section 28.6.
Roughgarden, T., Tardos, É. "How Bad Is Selfish Routing?" Journal of the ACM, 49(2), 2002. dl.acm.org
The paper that bounded the price of anarchy for network routing, the quantitative cost-of-selfishness result developed in Section 28.5.
Repeated Play and Learning
Axelrod, R. "The Evolution of Cooperation." Basic Books, 1984. basicbooks.com
The study of repeated prisoner's-dilemma tournaments that showed how tit-for-tat lets cooperation emerge among self-interested agents, the motivating narrative of Section 28.7.
Textbooks and Surveys
Nisan, N., Roughgarden, T., Tardos, É., Vazirani, V. V. (eds.). "Algorithmic Game Theory." Cambridge University Press, 2007. cambridge.org
The standard reference connecting game theory to computation, covering equilibrium computation, mechanism design, and the price of anarchy across Sections 28.3, 28.5, and 28.6.
Shoham, Y., Leyton-Brown, K. "Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations." Cambridge University Press, 2009. masfoundations.org
The graduate text that develops game theory specifically for multi-agent systems, freely available online, the general companion reference for the whole chapter.
Software and Modern Applications
Lanctot, M., et al. "OpenSpiel: A Framework for Reinforcement Learning in Games." arXiv:1908.09453, 2019. arxiv.org
The open-source library of games and game-theoretic algorithms, the practical toolkit for experimenting with the equilibria and learning dynamics of this chapter.
Brown, N., Sandholm, T. "Superhuman AI for Multiplayer Poker." Science, 365(6456), 2019. science.org
The Pluribus result that reached superhuman play in six-player poker, a landmark application of equilibrium approximation in a large imperfect-information game.
Meta FAIR Diplomacy Team (Bakhtin, A., et al.). "Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning." Science, 378(6624), 2022. science.org
The Cicero result combining a language model with game-theoretic planning to negotiate and cooperate with humans, a preview of the strategic agents Part VI builds toward.