Imagination — rehearsing futures that never happen
This agent (●) wants the goal (★). Before moving a muscle, it runs candidate futures through its internal model — the faint ghost paths are literally its imagination. Click the grid to add/remove walls and watch it re-dream instantly. Then let it act.
Ghost trails = imagined rollouts (brighter = judged better by the model). Solid trail = the one action sequence that actually gets executed.
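The imagine-then-act loop above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: the random-rollout planner, the Manhattan-distance score, and every name here (`step`, `imagine`, `plan`) are assumptions made for the example.

```python
import random

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(model_walls, size, pos, move):
    """The model's one-step prediction: walls and grid edges block movement."""
    nxt = (pos[0] + move[0], pos[1] + move[1])
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    return nxt if inside and nxt not in model_walls else pos

def imagine(model_walls, size, start, goal, horizon, rng):
    """One ghost trail: a random action sequence run only inside the model."""
    pos, actions = start, []
    for _ in range(horizon):
        if pos == goal:
            break
        move = rng.choice(MOVES)
        pos = step(model_walls, size, pos, move)
        actions.append(move)
    # Assumed score: closer to the goal at the end of the dream is better.
    score = -(abs(pos[0] - goal[0]) + abs(pos[1] - goal[1]))
    return score, actions

def plan(model_walls, size, start, goal, n_rollouts=200, horizon=12, seed=0):
    """Dream many futures, then keep the single best one to execute."""
    rng = random.Random(seed)
    ghosts = [imagine(model_walls, size, start, goal, horizon, rng)
              for _ in range(n_rollouts)]
    best = max(ghosts, key=lambda g: g[0])
    return best, ghosts
```

Only `best` would ever be executed; the rest of `ghosts` are futures that never happen. Changing `model_walls` and calling `plan` again corresponds to the re-dream after a click on the grid.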
Surprise — when reality disagrees with the dream
The agent's model predicts this ball's flight — the dotted line. Now sabotage it: switch on a hidden wind the model doesn't know about. Prediction and reality split apart, and the gap between them — prediction error — is the red meter. That error signal is precisely what the model learns from.
After a windy flight, click “Update model” — the model absorbs the error, and its next prediction accounts for wind.
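Here is one way the predict, compare, update cycle could look in code. It is a sketch under stated assumptions: wind is modeled as a constant horizontal acceleration, the error meter is the mean distance between predicted and actual positions, and the update solves for the wind in one shot (a real learner would more likely take a small step). All function names are hypothetical.

```python
def simulate(vx, vy, wind, gravity=9.8, dt=0.05, steps=40):
    """Step a ball forward; wind is an assumed constant horizontal acceleration."""
    x, y, path = 0.0, 0.0, []
    for _ in range(steps):
        vx += wind * dt
        vy -= gravity * dt
        x += vx * dt
        y += vy * dt
        path.append((x, y))
    return path

def prediction_error(predicted, actual):
    """The red meter: mean distance between the dream and what really happened."""
    gaps = [((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
            for (px, py), (ax, ay) in zip(predicted, actual)]
    return sum(gaps) / len(gaps)

def update_wind(model_wind, predicted, actual, dt=0.05):
    """Absorb the error: pick the wind that explains the final horizontal gap.

    Under the integration scheme above, wind shifts the final x by
    wind * dt^2 * n * (n + 1) / 2, so the gap can be inverted exactly.
    """
    n = len(actual)
    gap = actual[-1][0] - predicted[-1][0]
    return model_wind + gap / (dt * dt * n * (n + 1) / 2)

# The model starts out believing there is no wind; reality has wind = 3.0.
model_wind = 0.0
actual = simulate(8.0, 12.0, wind=3.0)
before = prediction_error(simulate(8.0, 12.0, wind=model_wind), actual)
model_wind = update_wind(model_wind, simulate(8.0, 12.0, wind=model_wind), actual)
after = prediction_error(simulate(8.0, 12.0, wind=model_wind), actual)
```

After the update, `after` should be close to zero while `before` is not, which is the code version of the dotted line snapping back onto the real flight.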
The race — trial-and-error vs. thinking ahead
Two agents, identical maze, same goal. Gray is model-free: it learns only by bumping into things, step after costly step. Red carries a world model: it plans the route internally first, then walks it. Count the steps.
The model-free agent explores at random but remembers the walls it has hit (a crude Q-learner's childhood). The planner runs breadth-first search inside its model, then executes the resulting path.
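The two racers can be sketched side by side. The breadth-first search follows the caption; the random walker with wall-memory is a guess at what "random exploration with wall-memory" means, and the grid representation and function names are assumptions for illustration.

```python
from collections import deque
import random

MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def blocked(cell, walls, size):
    """True if a cell is a wall or lies outside the size x size grid."""
    return not (0 <= cell[0] < size and 0 <= cell[1] < size) or cell in walls

def bfs_plan(walls, start, goal, size):
    """Planner: search inside the model; returns a shortest path or None."""
    parent, queue = {start: None}, deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not blocked(nxt, walls, size) and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None

def random_walk_steps(walls, start, goal, size, seed=0, max_steps=100_000):
    """Model-free racer: random moves, remembering only cells it has bumped into."""
    rng = random.Random(seed)
    bumped, pos, steps = set(), start, 0
    while pos != goal and steps < max_steps:
        options = [(pos[0] + dx, pos[1] + dy) for dx, dy in MOVES]
        options = [c for c in options if c not in bumped]
        if not options:
            break  # boxed in on every side
        nxt = rng.choice(options)
        steps += 1  # every attempt costs a real step, bump or not
        if blocked(nxt, walls, size):
            bumped.add(nxt)
        else:
            pos = nxt
    return steps

walls = {(1, 0), (1, 1)}
path = bfs_plan(walls, (0, 0), (3, 0), size=4)
planner_steps = len(path) - 1
walker_steps = random_walk_steps(walls, (0, 0), (3, 0), size=4)
```

The planner pays for its search in imagination and only `planner_steps` in the world; the walker pays for every attempt in the world, so `walker_steps` can never be smaller.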
The stale map — a perfect plan for a world that changed
A world model is only as good as its last update. Here the agent plans a flawless route — but the world has quietly changed (the semi-transparent wall is real; the agent's map doesn't have it). Watch it march confidently into the wall, get surprised, patch its map, and replan.
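The march, surprise, patch, replan cycle might look like the sketch below. The split into `believed_walls` (the agent's map) and `real_walls` (the world), along with the function names, is an assumption for illustration; the planner is a plain breadth-first search.

```python
from collections import deque

def shortest_path(walls, start, goal, size):
    """Breadth-first search on a square grid; returns a list of cells or None."""
    parent, queue = {start: None}, deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in walls and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None

def act_with_stale_map(believed_walls, real_walls, start, goal, size):
    """Plan on the believed map, walk in the real world, patch on each surprise."""
    believed = set(believed_walls)
    pos, surprises, trail = start, 0, [start]
    while pos != goal:
        plan = shortest_path(believed, pos, goal, size)
        if plan is None:
            return None, surprises  # the patched map says the goal is unreachable
        for nxt in plan[1:]:
            if nxt in real_walls:    # reality disagrees with the map
                believed.add(nxt)    # patch the map...
                surprises += 1
                break                # ...and replan from the current cell
            pos = nxt
            trail.append(pos)
    return trail, surprises

# The map is empty, but the world has quietly gained a wall at (1, 0).
trail, surprises = act_with_stale_map(set(), {(1, 0)}, (0, 0), (2, 0), size=3)
```

Each surprise adds a wall the map did not have before, so the loop cannot repeat the same mistake and is guaranteed to finish.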
The horizon problem — tiny errors compound
This agent's physics model is almost perfect — gravity is off by just 6%. Drag the horizon slider: predicting a few steps ahead, the error is invisible; predicting far ahead, the dream and reality end up in different places entirely.
Solid = reality. Dashed = the model's imagination with a 6% gravity error.
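A small simulation shows how the gap grows with the horizon. This is a sketch, not the demo's code: it assumes the model overestimates gravity by 6% (the text does not say in which direction), a ball launched straight up, and a simple fixed-step integrator; all names and default values are illustrative.

```python
def rollout_height(steps, gravity, v0=20.0, dt=0.1):
    """Height after `steps` simulated steps for a ball launched straight up."""
    y, v = 0.0, v0
    for _ in range(steps):
        v -= gravity * dt
        y += v * dt
    return y

def dream_vs_reality(steps, g_true=9.8, rel_error=0.06):
    """Gap between the real height and the model's imagined height."""
    g_model = g_true * (1 + rel_error)  # assumed: the model's gravity is 6% too strong
    return abs(rollout_height(steps, g_true) - rollout_height(steps, g_model))

# Sweep the horizon slider: a few steps ahead versus far ahead.
for horizon in (3, 10, 30, 60):
    print(horizon, round(dream_vs_reality(horizon), 3))
```

Because the velocity error accumulates every step and the position integrates that growing error, the gap rises roughly with the square of the horizon rather than linearly: doubling how far ahead the agent dreams about quadruples how wrong the dream is.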