This is a really clear way to think about context windows. The “lost in the middle” problem, how RAG tries to compensate, and why people end up resetting instead of building on prior context all connect here.
At some point, it’s not just about how much information you include, but whether that information holds together in a way the model can actually use.
📍 I made a new drawing about context windows. It covers “lost in the middle”, how RAG affects it, tokenizers, and more.
Understanding context windows helps you debug and leverage LLMs more effectively. It also shows why people like Boris from Claude refresh the entire window at times.