Ruan et al., Feb 2026
Formalizes an agent as a tuple (Instruction, Context, Tools, Model) and automates agent creation. Reports a 16.28% improvement over the strongest baseline. Directly relevant to the Isolate and Recursive Delegation patterns: the orchestrator curates task-relevant context and delegates by creating agents on the fly.
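To make the tuple concrete, here is a minimal sketch of an agent specification and an orchestrator helper that builds a sub-agent with curated context. The names `AgentSpec` and `create_subagent`, the keyword-based relevance filter, and the model identifier are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Tool = Callable[[str], str]


@dataclass(frozen=True)
class AgentSpec:
    """An agent described as (Instruction, Context, Tools, Model)."""
    instruction: str
    context: Tuple[str, ...]
    tools: Dict[str, Tool]
    model: str  # hypothetical model identifier


def create_subagent(task: str, notes: Tuple[str, ...], tools: Dict[str, Tool],
                    model: str, keywords: Tuple[str, ...]) -> AgentSpec:
    """Build a sub-agent spec whose context holds only task-relevant notes.

    Relevance is a naive keyword match, an assumption made purely for illustration.
    """
    relevant = tuple(
        n for n in notes if any(k.lower() in n.lower() for k in keywords)
    )
    return AgentSpec(instruction=task, context=relevant, tools=tools, model=model)


# Hypothetical orchestrator notes; only the billing ones should reach the sub-agent.
notes = (
    "The billing service times out after 30 seconds.",
    "The marketing site uses a static generator.",
    "Billing retries are capped at three attempts.",
)
spec = create_subagent(
    task="Diagnose billing timeouts",
    notes=notes,
    tools={"echo": lambda s: s},
    model="example-model",
    keywords=("billing",),
)
print(len(spec.context), "of", len(notes), "notes passed to the sub-agent")
```

The sub-agent receives a filtered slice of the orchestrator's notes rather than the full history, which is the isolation idea in miniature.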
JetBrains Research, Dec 2025
Practical hybrid approach that combines observation masking with LLM summarization, evaluated on SWE-bench. Reports a 7–11% cost reduction. Useful as a real-world engineering case study rather than a benchmark paper.
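One way such a hybrid could be wired together is sketched below. This is an assumption about the general shape of the technique, not the JetBrains implementation: the thresholds, the `compact_history` name, the turn format, and the stub summarizer (which stands in for an LLM call) are all hypothetical.

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]  # keys: "action", "observation"
MASK = "[observation omitted]"


def compact_history(
    turns: List[Turn],
    summarize: Callable[[List[Turn]], str],
    keep_recent: int = 2,
    max_turns: int = 4,
) -> List[Turn]:
    """Shrink an agent trajectory with summarization plus observation masking."""
    # Step 1 (summarization): fold turns beyond max_turns into one summary turn.
    if len(turns) > max_turns:
        overflow = turns[: len(turns) - max_turns]
        summary_turn = {"action": "summary", "observation": summarize(overflow)}
        turns = [summary_turn] + turns[len(turns) - max_turns:]

    # Step 2 (masking): keep every action, but hide older observations.
    cutoff = len(turns) - keep_recent
    compacted = []
    for i, turn in enumerate(turns):
        if i < cutoff and turn["action"] != "summary":
            compacted.append({"action": turn["action"], "observation": MASK})
        else:
            compacted.append(dict(turn))
    return compacted


# Stub summarizer standing in for an LLM call (assumption for this sketch).
def stub_summarize(old: List[Turn]) -> str:
    return f"{len(old)} earlier turns summarized"


history = [{"action": f"step {i}", "observation": f"obs {i}"} for i in range(6)]
compacted = compact_history(history, stub_summarize)
for turn in compacted:
    print(turn["action"], "->", turn["observation"])
```

Masking is cheap because it needs no model call, while summarization costs a call but bounds the total number of turns; combining them is what makes the approach a hybrid.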
Stanford/SambaNova, Oct 2025
Demonstrated a +10.6% improvement on agent benchmarks and +8.6% on finance tasks through better context engineering alone, with no model changes. The key insight: contexts should function as “comprehensive, evolving playbooks,” not concise summaries. Also introduced the concept of “context collapse,” where iterative rewriting erodes detail over time.
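The toy sketch below illustrates the failure mode rather than the paper's method. A fixed character budget with truncation stands in for a lossy LLM rewrite, and an append-only list stands in for an evolving playbook. Both functions, the budget, and the example notes are assumptions made for illustration.

```python
from typing import List


def rewrite_context(context: str, new_note: str, budget: int = 80) -> str:
    """Simulate a lossy rewrite: merge, then squeeze into a fixed character budget."""
    merged = f"{context} {new_note}".strip()
    return merged[-budget:]  # older detail falls off the front


def append_to_playbook(playbook: List[str], new_note: str) -> List[str]:
    """Grow the playbook incrementally, keeping earlier entries intact."""
    if new_note in playbook:
        return playbook
    return playbook + [new_note]


# Hypothetical lessons an agent might accumulate over several episodes.
notes = [
    "Validate ticker symbols before calling the pricing tool.",
    "Fiscal years may not match calendar years.",
    "Report currency alongside every amount.",
]

rewritten = ""
playbook: List[str] = []
for note in notes:
    rewritten = rewrite_context(rewritten, note)
    playbook = append_to_playbook(playbook, note)

print("first lesson survives rewriting:", notes[0] in rewritten)
print("first lesson survives in playbook:", notes[0] in playbook)
```

After three updates the rewritten context no longer contains the first lesson in full, while the playbook still holds every entry. A real system would need some way to deduplicate or prune the playbook as it grows; that part is left out here.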