
Mental Models: A Visual Glossary

The book's recurring "aha" pictures, gathered as one illustrated reference you can return to from any chapter.

"They taught me fourteen pictures and then forty-one chapters. I confess I navigated the whole book by the pictures, and only consulted the chapters when a picture stopped being enough."

A Worker That Thinks in Diagrams

Big Picture

A book that travels from the MapReduce shuffle to mixture-of-experts routing to multi-agent swarms keeps returning to the same small set of pictures, and this page collects those pictures in one place. Each of the fourteen mental models below is a single image that compresses one of the book's core "aha" moments: the six axes along which a system can be distributed, the ring that makes gradient synchronization bandwidth-optimal, the funnel that powers search and RAG alike, and the masks that cancel so that a server learns only a sum. Every illustration here also appears inline in the section that first teaches the concept, with the full prose, math, and code around it; this gallery is the map, and the linked section is the territory. Skim it once now to meet the book's recurring shapes, and return to it whenever a later chapter reuses a concept and you want the one-picture reminder of where it came from.
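To make "the masks that cancel so that a server learns only a sum" concrete before you reach the card, here is a minimal sketch of pairwise masking. It is an illustration under stated assumptions, not the protocol as the book develops it: the names `MODULUS`, `pairwise_masks`, `masked_uploads`, and `server_sum` are hypothetical, the shared secrets are drawn from a seeded `random.Random` rather than a cryptographic key agreement, and client dropouts are ignored.

```python
import random

MODULUS = 2**32  # hypothetical modulus; all arithmetic is done mod this value


def pairwise_masks(num_clients, rng):
    """Return one net mask per client; the masks sum to 0 mod MODULUS."""
    masks = [0] * num_clients
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            # Secret shared by clients i and j only (assumption: a real
            # system would derive this from a key exchange, not random.Random).
            shared = rng.randrange(MODULUS)
            masks[i] = (masks[i] + shared) % MODULUS  # client i adds it
            masks[j] = (masks[j] - shared) % MODULUS  # client j subtracts it
    return masks


def masked_uploads(values, rng):
    """Each client uploads its value plus its net mask."""
    masks = pairwise_masks(len(values), rng)
    return [(v + m) % MODULUS for v, m in zip(values, masks)]


def server_sum(uploads):
    """The server adds the uploads; every pairwise mask cancels."""
    return sum(uploads) % MODULUS


client_values = [3, 10, 25, 4]
uploads = masked_uploads(client_values, random.Random(0))
total = server_sum(uploads)
print(total)  # 42, the sum of client_values
```

Each individual upload is offset by masks the server does not know, yet every shared secret is added once and subtracted once, so the total is exact.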

The fourteen concepts below are not a random sample; they are the ideas that recur across parts and carry the book's connective weight. Several reappear later in scaled-out or specialized form: the ring all-reduce of Chapter 4 is the engine under data-parallel training, the MapReduce shuffle of Chapter 6 returns as the all-to-all of expert routing, and the parameter-server push-pull of Chapter 11 reappears as the sharded gather-and-scatter of ZeRO and FSDP. The cards follow the book's arc: foundations first, then data, training, inference, and finally multi-agent systems and privacy. Each card names the chapter and section where the picture is first developed in full, so the gallery doubles as an index of first appearances.
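As a preview of the ring picture, the following single-process simulation sketches a ring all-reduce as a reduce-scatter followed by an all-gather. It is a toy under stated assumptions: the function name `ring_all_reduce` is hypothetical, each worker's vector is split into exactly one element per worker, and "sending" is just a list update rather than network traffic.

```python
def ring_all_reduce(worker_vectors):
    """Simulate a ring all-reduce (sum) over n workers with n chunks each.

    Returns the per-worker results and the number of chunks each worker sent.
    """
    n = len(worker_vectors)
    data = [list(v) for v in worker_vectors]
    assert all(len(v) == n for v in data), "one chunk per worker assumed"
    chunks_sent = 0

    # Phase 1, reduce-scatter: partial sums travel around the ring.
    # At each step worker w sends chunk (w - step) % n to its right neighbor.
    for step in range(n - 1):
        messages = [((w - step) % n, data[w][(w - step) % n]) for w in range(n)]
        for w, (chunk, value) in enumerate(messages):
            data[(w + 1) % n][chunk] += value
        chunks_sent += 1
    # Now worker w holds the fully reduced chunk (w + 1) % n.

    # Phase 2, all-gather: the finished chunks travel around the ring.
    for step in range(n - 1):
        messages = [((w + 1 - step) % n, data[w][(w + 1 - step) % n]) for w in range(n)]
        for w, (chunk, value) in enumerate(messages):
            data[(w + 1) % n][chunk] = value
        chunks_sent += 1

    return data, chunks_sent


gradients = [
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [100, 200, 300, 400],
    [1000, 2000, 3000, 4000],
]
reduced, chunks_sent = ring_all_reduce(gradients)
print(reduced[0])   # [1111, 2222, 3333, 4444]
print(chunks_sent)  # 6, i.e. 2 * (n - 1) chunks per worker
```

Each worker sends 2(n - 1) chunks, each 1/n of the vector, so the data sent per worker stays just under twice the vector size however many workers join the ring.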

What's Next

These fourteen pictures are the shorthand; the chapters are the full account. The natural place to begin is Chapter 1, where the six axes shown in this gallery are derived rather than assumed. It opens the book proper by showing the point at which one machine stops being enough, watches many workers compute the same gradient a single machine would have computed, and then turns that one calculation into the six axes of distribution that organize everything that follows. Read it next, and return to this gallery whenever a later chapter reuses a shape you first met here.
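The claim that many workers can compute the gradient a single machine would have computed can be checked on a toy problem. The sketch below is an illustration only, not Chapter 1's derivation: it assumes a one-parameter model with mean-squared-error loss and equal-sized shards, and the names `gradient` and `data_parallel_gradient` are hypothetical.

```python
def gradient(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def data_parallel_gradient(w, xs, ys, num_workers):
    """Average the gradients that each worker computes on its own shard."""
    shard = len(xs) // num_workers
    assert shard * num_workers == len(xs), "equal-sized shards assumed"
    local_grads = [
        gradient(w, xs[k * shard:(k + 1) * shard], ys[k * shard:(k + 1) * shard])
        for k in range(num_workers)
    ]
    return sum(local_grads) / num_workers


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w = 1.0

full_grad = gradient(w, xs, ys)                                   # one machine
averaged_grad = data_parallel_gradient(w, xs, ys, num_workers=4)  # four workers
print(full_grad, averaged_grad)  # -51.0 -51.0
```

The two values agree because the mean over the full batch equals the mean of the shard means when the shards are the same size; with unequal shards, the average would need to be weighted by shard size.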