Pattern Catalog

29 patterns for managing what goes into the context window. Each one names a failure mode, a structural fix, and the evidence behind it.

The Problem

Context Rot

Model quality degrades as context gets longer, even well within the window limit. 11 of 13 models drop to half their baseline at 32k tokens. Every pattern below exists because of this.

NoLiMa Benchmark ·Lost in the Middle

Core Patterns

The six patterns from the learning path. Start here.

The Pyramid

Start with general background, progressively add specific details. Give the model altitude before asking it to land. Mirrors how experts brief each other; context first, task second.

Select, Don't Dump

Include the smallest set of high-signal tokens that helps the model do the task. More context usually means weaker attention.

Compress & Restart

When conversations grow long, summarize what matters and start fresh. Context quality degrades well before hitting advertised limits.

Write Outside the Window

Persist important context to external storage: scratchpads, memory files, knowledge bases. The context window is working memory, not long-term memory.

Grounding

Retrieval gets information into context. Grounding makes the model actually use it. Without explicit anchoring instructions, the model will often ignore what you retrieved and fall back to whatever it absorbed during training.

Anchor Turn

Front-load all source reads into one turn so every subsequent turn works from cache.

All Patterns

Advanced and specialized patterns beyond the core set.

Isolate

Give sub-agents their own focused contexts instead of sharing one massive window. Anthropic's multi-agent system uses 15x more tokens total but gets better results, because each agent sees only what it needs.

Recursive Delegation

Let agents spawn child agents with scoped sub-contexts. Instead of stuffing everything into one window, the parent splits work, delegates with focused context, and aggregates results.

Progressive Disclosure

Start with an index of available context. Let the model pull in details on demand instead of loading everything upfront.

Schema Steering

A JSON schema tells the model what to think about, in what order, and with what vocabulary. Define the structure and the model's reasoning follows.

Context Caching

Reuse computed context across requests to reduce costs and latency. Structure prompts so the stable prefix gets cached and only the variable part changes.

Attention Anchoring

Place critical information at the start and end of context. Models over-attend to the beginning and end of their context window, a phenomenon called 'lost in the middle.' Work with this bias instead of against it.

Temporal Decay

Weight recent context higher and systematically age out old information. Old messages can stay available, but they should compete less with the current task.

Tool Descriptions as Context

Tool definitions are context. The description tells the model when to use a tool and how. Most descriptions only say what the tool does; the ones that work also say when to use it and when not to.

Few-Shot Selection

Choose examples that resemble the current input, even when the easiest examples to find look cleaner. The wrong examples teach the model the wrong behavior.

Context Budget

Treat the context window as a finite resource with planned allocations. Decide upfront how many tokens each section gets, then enforce it.

Role Framing

Defining a role in the system prompt does more than set a tone. It activates a vocabulary, constrains scope, and steers which heuristics the model applies. The specificity of the role determines how much of that steering actually lands.

Multi-Modal Context

Images consume tokens aggressively and at unpredictable rates. Choose the right modality for each piece of context (raw image, text description, or structured extraction) before the model sees it.

Negative Constraints

"Don't do X" is weaker than it looks. Negative instructions activate attention on the prohibited thing and leave the model without a path forward. Reserve them for hard stops; use positive framing everywhere else.

Context Handoff

When one agent passes work to another, most of the context gets lost. The handoff boundary is where multi-agent systems silently degrade, because nobody designed what travels with the task.

Context Poisoning

A hallucination in the context window becomes ground truth for every subsequent turn. The model generated it, so it trusts it, and the error compounds silently until the output is confidently wrong about something that was never true.

Retrieval as Context Curation

Retrieval isn't just search. Every retrieval decision is a context engineering decision: what to retrieve, how much, in what order, and what to leave out. The vector store returns candidates; you decide what earns a place in the window.

Instruction Hierarchy

Not all context is created equal. System instructions, user messages, retrieved documents, and tool outputs compete for the model's attention, and without explicit priority signals the model resolves conflicts unpredictably.

Scratchpad

Maintain structured working state inside the context window: a running plan, a list of findings, a set of decisions made so far. Without an explicit scratchpad, the model reconstructs its state from raw conversation history on every turn, and gets worse at it as the conversation grows.

Retrieval Subagent

Split context retrieval into a focused agent that returns exact evidence. The main agent should receive selected files, ranges, and facts after the search noise has been discarded.

Validate Compression

Treat every summary as a risky rewrite. Validate compressed context against the next task before you trust it.

Causal Memory Selection

Select memories by their measured effect on the answer. A memory belongs in context only if it improves the next step.

State Sanitization

Clean unsafe or adversarial state before it enters memory, summaries, or handoffs. Sanitizing only the final summary is too late.

Trace the Work

Persist the agent's evidence trail alongside the artifact it changed. Future agents need the reasoning path and the final diff together.