mycelial structure
Recursive Language Models (RLMs) started with a paper from MIT and evolved into Prime Intellect’s RLMEnv for long-horizon agents.
Rather than directly ingesting its (potentially large) input data, the RLM allows an LLM to use a persistent Python REPL to inspect and transform its input data, and to call sub-LLMs from within that Python REPL.
Prime Intellect believes the simplest, most flexible method for context folding is the Recursive Language Model (RLM).
Prime Intellect implemented “a variation of the RLM” as an experimental RLMEnv inside their open-source verifiers library, which makes it plug-and-play inside any verifiers environment.
The big idea here is pretty simple: stop trying to cram “everything so far” into one giant context window, and instead give the model a way to work outside its own context using code and extra model calls, while keeping the main model’s context short and clean. That is what they mean by a Recursive Language Model (RLM).
In their setup, the main model sits on top of a persistent Python REPL, and it can also spin up sub-LLMs (fresh copies of itself) using a batching function so it can run lots of small jobs in parallel.
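To make that setup concrete, here is a minimal sketch of a persistent namespace with a parallel batching helper. It is not the verifiers API: `PersistentRepl`, `sub_llm_batch`, and `fake_sub_llm` are hypothetical names, and the "sub-LLM" is a stub that only reports the prompt length.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fake_sub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; reports the prompt length."""
    return f"answer({len(prompt)} chars)"


def sub_llm_batch(prompts: List[str],
                  call: Callable[[str], str] = fake_sub_llm,
                  max_workers: int = 8) -> List[str]:
    """Run many small sub-LLM jobs in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, prompts))


class PersistentRepl:
    """Keeps one namespace alive across turns, like a long-running REPL."""

    def __init__(self) -> None:
        # The batching helper is exposed to the code the main model writes.
        self.namespace = {"sub_llm_batch": sub_llm_batch}

    def run(self, code: str) -> None:
        exec(code, self.namespace)


repl = PersistentRepl()
# Turn 1: the main model stores data in the REPL.
repl.run("chunks = ['alpha', 'beta beta', 'gamma gamma gamma']")
# Turn 2: variables from turn 1 are still there, and the jobs fan out in parallel.
repl.run("answers = sub_llm_batch([f'Summarize: {c}' for c in chunks])")
print(repl.namespace["answers"])
```

The point of the sketch is the shape: state lives in the REPL between turns, and the main model only writes short snippets of code that dispatch the heavy work.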
The key trick is that tool use is only allowed for the sub-LLMs, not the main model, because tool outputs can explode into huge token dumps. So the main model stays “lean”, and it delegates the messy, token-heavy stuff to sub-LLMs and Python.
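A rough illustration of why this matters, with everything stubbed out: `noisy_search_tool` and `sub_llm_with_tools` are hypothetical names, and the "sub-LLM" is plain Python that just counts results. The raw tool dump stays inside the sub-call, and only a one-line answer reaches the main model's context.

```python
from typing import List


def noisy_search_tool(query: str) -> str:
    """Hypothetical tool whose raw output is far too large for the main context."""
    return "\n".join(
        f"result {i} for {query}: " + "lorem " * 50 for i in range(200)
    )


def sub_llm_with_tools(task: str) -> str:
    """Stand-in for a sub-LLM: it sees the raw tool dump and returns a short answer."""
    raw = noisy_search_tool(task)
    n_results = raw.count("\n") + 1
    return f"{n_results} results found for '{task}'"


main_context: List[str] = []  # what the main model actually reads
main_context.append(sub_llm_with_tools("rlm context folding"))

print(len(noisy_search_tool("rlm context folding")))  # size of the raw dump
print(main_context)                                    # what the main model sees
```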
A detail that’s more important than it sounds is how they force discipline. Extra input data does not automatically land in the model’s context. It sits in Python, and the model only sees what it chooses to print, and even that printout is capped at 8192 characters per turn by default.
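One way to picture that discipline is a turn runner that captures whatever the code prints and cuts it off at the cap. This is a sketch under assumptions: `run_turn` and the truncation marker are made up for illustration, and the real environment may truncate differently. Only the 8192-character default comes from the description above.

```python
import contextlib
import io

MAX_OUTPUT_CHARS = 8192  # default per-turn cap described above


def run_turn(code: str, namespace: dict, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Execute code in a shared namespace and return only the capped stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    output = buffer.getvalue()
    if len(output) > limit:
        # Hypothetical marker; the real truncation format is not specified here.
        output = output[:limit] + "\n[output truncated]"
    return output


# The large input lives in Python, not in the model's context.
namespace = {"document": "x" * 1_000_000}

silent = run_turn("n = len(document)", namespace)  # nothing printed, nothing seen
small = run_turn("print(n)", namespace)            # a deliberate, tiny printout
big = run_turn("print(document)", namespace)       # a careless dump gets capped
```

Here `silent` is empty, `small` holds just the length, and `big` is cut at the cap, so the model is pushed toward printing summaries rather than raw data.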
So if the model wants to deal with a massive PDF, dataset, or long transcript, it has to use Python to slice and filter, and it often has to ask sub-LLMs to scan chunks and return short answers.
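That workflow might look roughly like the following: split the input in Python, drop irrelevant chunks with a cheap filter, and send only the survivors to sub-LLMs that each return one short line. All names here (`split_lines_into_chunks`, `count_mentions`, `scan_transcript`) are hypothetical, and the sub-LLM is replaced by a regex counter so the example runs on its own.

```python
import re
from typing import Callable, List


def split_lines_into_chunks(text: str, lines_per_chunk: int) -> List[str]:
    """Group the input into chunks of a fixed number of lines."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


def count_mentions(chunk: str, keyword: str = "refund") -> str:
    """Hypothetical sub-LLM stand-in: answers with one short line."""
    hits = len(re.findall(keyword, chunk, flags=re.IGNORECASE))
    return f"{hits} mention(s)"


def scan_transcript(text: str, keyword: str,
                    ask: Callable[[str], str] = count_mentions) -> List[str]:
    chunks = split_lines_into_chunks(text, lines_per_chunk=2)
    # Cheap Python filter first, so sub-LLM calls go only to promising chunks.
    relevant = [(i, c) for i, c in enumerate(chunks) if keyword in c.lower()]
    return [f"chunk {i}: {ask(c)}" for i, c in relevant]


lines = [f"line {i}: small talk" for i in range(6)]
lines[4] = "line 4: customer asks for a refund"
findings = scan_transcript("\n".join(lines), keyword="refund")
print(findings)
```

The main model would then read only `findings`, a handful of short strings, instead of the full transcript.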