Recent agentic systems (Claude Code, Codex, RLM, etc.) push context out of the prompt and into the environment (e.g., as files). This helps them maintain long-term knowledge about their goals and functionality.
While this is a good idea, we show a surprising result: systems that use external environments like this perform much better when given a small, fixed-size, in-context, agent-managed cache that "peeks into" these environments.
Our paper, PEEK: a system for building and maintaining an orientation cache for LLM agents, introduces this idea.
Compared with strong baselines, including RAG, Compaction Agents, and SOTA prompt-learning frameworks, PEEK dominates the cost–quality Pareto frontier, improving quality by 6.3–34.0% with fewer iterations and lower cost.
Paper:
arxiv.org/abs/2605.19932
GitHub:
github.com/zhuohangu/peek
More in the thread below! (1/N)