The emerging discipline of designing what information AI systems work with - and how it's structured.
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step."
Start with general background, progressively add specific details. Give the model altitude before asking it to land. Mirrors how experts brief each other - context first, task second.
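A minimal sketch of that altitude-first ordering, background before specifics before task; the `build_prompt` helper and section labels are illustrative, not a fixed API.

```python
# Assemble a prompt in altitude order: broad background first,
# specifics next, the task last. Names here are hypothetical.
def build_prompt(background: str, details: list[str], task: str) -> str:
    parts = [
        "## Background\n" + background,  # give the model altitude
        "## Specifics\n" + "\n".join(f"- {d}" for d in details),
        "## Task\n" + task,              # ask it to land last
    ]
    return "\n\n".join(parts)

prompt = build_prompt(
    background="We maintain a payments API used by dozens of internal services.",
    details=["Refunds fail intermittently since the last deploy",
             "Errors cluster around the idempotency-key check"],
    task="Propose three hypotheses for the refund failures, most likely first.",
)
print(prompt)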
The smallest set of high-signal tokens that maximizes the desired outcome. In the NoLiMa benchmark, 11 of 12 models dropped below 50% of their short-context performance at 32k tokens. More context isn't better context. Surgical selection beats comprehensive inclusion.
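One way surgical selection might look in practice: score candidate snippets for relevance, then greedily pack the highest-signal ones under a hard token budget. The lexical scorer and the 4-characters-per-token estimate below are rough stand-ins for an embedding model and a real tokenizer.

```python
def select_context(snippets: list[str], query: str, budget_tokens: int) -> list[str]:
    def score(s: str) -> float:
        # naive lexical overlap as a placeholder relevance score
        q = set(query.lower().split())
        return len(q & set(s.lower().split())) / (len(q) or 1)

    def tokens(s: str) -> int:
        return len(s) // 4  # rough heuristic: ~4 characters per token

    chosen, spent = [], 0
    for s in sorted(snippets, key=score, reverse=True):
        if spent + tokens(s) <= budget_tokens:
            chosen.append(s)
            spent += tokens(s)
    return chosen
```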
Give sub-agents their own focused contexts instead of sharing one massive window. Anthropic's multi-agent research system uses about 15x more tokens than a single chat interaction but gets better results - because each agent sees only what it needs.
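A sketch of the isolation pattern, assuming a generic chat-completion client (`call_llm` is a placeholder): each worker starts from a fresh, focused brief, and only compact summaries flow back to the lead agent.

```python
def call_llm(system: str, user: str) -> str:
    # placeholder; swap in any chat-completion client
    return f"(model reply to: {user[:40]}...)"

def run_subagent(role: str, brief: str) -> str:
    # each sub-agent starts from a fresh, focused context window
    return call_llm(system=f"You are a {role}. Answer only from your brief.",
                    user=brief)

def orchestrate(task: str, briefs: dict[str, str]) -> str:
    findings = {role: run_subagent(role, brief) for role, brief in briefs.items()}
    merged = "\n".join(f"[{role}] {f}" for role, f in findings.items())
    # the lead agent sees compact summaries, never the workers' full contexts
    return call_llm(system="Synthesize the findings into one answer.",
                    user=f"Task: {task}\n\nFindings:\n{merged}")

print(orchestrate("Assess launch readiness",
                  {"security reviewer": "Audit notes: ...",
                   "performance analyst": "Load test results: ..."}))
```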
When conversations grow long, summarize what matters and start fresh. Context quality degrades past 85% window capacity. Compaction beats continuation. Treat context like a cache, not a log.
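A sketch of a compaction trigger, assuming a 200k-token window and a rough character-based token estimate; the summarization prompt is illustrative.

```python
WINDOW_TOKENS = 200_000   # assumed window size
THRESHOLD = 0.85          # compact past 85% capacity, per the figure above

def estimate_tokens(messages: list[str]) -> int:
    return sum(len(m) for m in messages) // 4  # ~4 chars per token, rough

def maybe_compact(messages: list[str], call_llm) -> list[str]:
    if estimate_tokens(messages) < THRESHOLD * WINDOW_TOKENS:
        return messages  # still healthy: continue as-is
    summary = call_llm("Summarize decisions, open questions, and key facts.",
                       "\n".join(messages))
    # fresh window seeded with the summary; the raw log is discarded
    return [f"Summary of prior conversation:\n{summary}"]
```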
Persist important context to external storage - scratchpads, memory files, knowledge bases. The context window is working memory, not long-term memory. What matters should survive beyond a single session.
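One simple form this can take is a JSON scratchpad on disk; the file name and schema below are arbitrary, but the point stands: durable facts survive the session while the window stays lean.

```python
import json
from pathlib import Path

MEMORY = Path("memory.json")  # hypothetical scratchpad location

def remember(key: str, value: str) -> None:
    store = json.loads(MEMORY.read_text()) if MEMORY.exists() else {}
    store[key] = value
    MEMORY.write_text(json.dumps(store, indent=2))

def recall(key: str) -> str | None:
    if not MEMORY.exists():
        return None
    return json.loads(MEMORY.read_text()).get(key)

remember("deploy_process", "Ship via CI on merge to main; canary for 1 hour.")
print(recall("deploy_process"))
```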
Don't pre-load everything. Maintain lightweight identifiers and load full data on demand at runtime. Let the model discover what it needs through exploration, like a developer navigating a codebase.
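A sketch of the pattern: expose cheap identifiers first, fetch full contents only on request. The directory layout and the `.py` filter are illustrative; in an agent setting, both functions would typically be exposed as tools.

```python
from pathlib import Path

def list_identifiers(root: str) -> list[str]:
    # cheap metadata up front, not the files themselves
    return [f"{p} ({p.stat().st_size} bytes)"
            for p in Path(root).rglob("*.py")]

def load_on_demand(path: str) -> str:
    # full content fetched only when the model asks for it
    return Path(path).read_text()

print("\n".join(list_identifiers(".")))
```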
See what your AI actually sees. A framework-agnostic proxy that intercepts LLM API calls and visualizes context window composition in real time. Works with Claude, GPT, and any tool that calls an LLM API.
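A hedged sketch of wiring a client through such a proxy; the listen address is an assumption, and the pattern is simply repointing the client's base URL so every request passes through the interceptor.

```python
import os
from openai import OpenAI

# Proxy address is hypothetical; the API key is read from the
# environment and forwarded by the proxy unchanged.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello through the proxy"}],
)
print(response.choices[0].message.content)
```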
This space is being built. Context engineering is the skill that determines whether AI systems work in production. Follow the work.