A 0.6B model learned to manage giants.
That is the idea behind TRINITY, a new ICLR 2026 paper by Jinglue Xu, Qi Sun, Peter Schwendeman, Stefan Nielsen, Edoardo Cetin, and Yujin Tang.
The paper is not asking:
“How do we build one model that knows everything?”
It is asking something more interesting:
“How do we build a small intelligence layer that knows who should think, who should act, and who should verify?”
TRINITY is a lightweight coordinator for LLMs.
It does not merge weights.
It does not require architectural compatibility.
It does not need access to closed-model internals.
It does not try to turn the coordinator into the smartest model in the room.
Instead, it orchestrates a pool of strong models at test time, both closed and open.
At each turn, TRINITY chooses a model and gives it one of three roles:
Thinker: plan and decompose
Worker: solve and execute
Verifier: critique, then accept or ask for a revision
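To make the three roles concrete, here is a minimal Python sketch of a turn loop. Everything in it is an assumption for illustration: the names `toy_model`, `fixed_policy`, and `run_episode`, the stubbed model replies, and the rule that the episode stops when a Verifier accepts. None of this is the paper's code, and a real coordinator would replace `fixed_policy` with a learned one.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Role(Enum):
    THINKER = "thinker"    # plan and decompose
    WORKER = "worker"      # solve and execute
    VERIFIER = "verifier"  # critique, then accept or ask for a revision


# Hypothetical stand-in for an LLM call: (role, transcript) -> reply text.
Model = Callable[[Role, List[str]], str]
Policy = Callable[[int, List[str]], Tuple[str, Role]]


def toy_model(name: str) -> Model:
    """Build a stub model that returns canned text for each role."""
    def call(role: Role, transcript: List[str]) -> str:
        if role is Role.THINKER:
            return f"[{name}] plan: split the task into steps"
        if role is Role.WORKER:
            return f"[{name}] draft answer"
        # The stub verifier accepts only if some worker has produced a draft.
        has_draft = any("draft answer" in turn for turn in transcript)
        return f"[{name}] ACCEPT" if has_draft else f"[{name}] REVISE"
    return call


def fixed_policy(turn: int, transcript: List[str]) -> Tuple[str, Role]:
    """Hand-written schedule; this is the part a learned coordinator replaces."""
    schedule = [
        ("model_a", Role.THINKER),
        ("model_b", Role.WORKER),
        ("model_a", Role.VERIFIER),
    ]
    return schedule[turn % len(schedule)]


def run_episode(task: str, pool: Dict[str, Model], policy: Policy,
                max_turns: int = 6) -> List[str]:
    """Pick a (model, role) pair each turn until a verifier accepts."""
    transcript = [task]
    for turn in range(max_turns):
        name, role = policy(turn, transcript)
        reply = pool[name](role, transcript)
        transcript.append(reply)
        if role is Role.VERIFIER and reply.endswith("ACCEPT"):
            break  # assumed stopping rule for this sketch
    return transcript


pool = {"model_a": toy_model("model_a"), "model_b": toy_model("model_b")}
for line in run_episode("Write a function that reverses a list.", pool, fixed_policy):
    print(line)
```

The same model can hold different roles on different turns, which is why the policy returns a pair and not just a model name.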
That may sound simple.
It is not.
Too many multi-agent systems are still prompts plus hope.
TRINITY learns the coordination policy.
A compact language model of about 0.6B parameters encodes the conversation so far into hidden-state representations. A tiny head reads those representations and picks the next model-role pair. The authors optimize this coordinator with sep-CMA-ES, an evolution strategy, because each evaluation is expensive, the search space is high-dimensional, and rewards are sparse.
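A rough sketch of that idea in Python, under heavy assumptions: the "hidden states" are random toy vectors and not the output of a language model, the head is a plain linear scorer, and the optimizer is a simplified diagonal-Gaussian search in the cross-entropy style. It shares the per-coordinate (diagonal) variances of sep-CMA-ES but leaves out its evolution paths and step-size adaptation, so it is not the authors' algorithm. All names (`choose_pair`, `make_toy_data`, `diagonal_es`) and all sizes are made up for illustration.

```python
import itertools
import random

MODELS = ["model_a", "model_b", "model_c"]      # hypothetical model pool
ROLES = ["thinker", "worker", "verifier"]
PAIRS = list(itertools.product(MODELS, ROLES))  # 9 candidate model-role pairs
DIM = 8                                         # toy hidden-state size


def choose_pair(hidden, params):
    """Linear head: one score per pair, return the index of the best one."""
    scores = [
        sum(params[p * DIM + j] * hidden[j] for j in range(DIM))
        for p in range(len(PAIRS))
    ]
    return max(range(len(PAIRS)), key=scores.__getitem__)


def make_toy_data(rng, n=40):
    """Random vectors, each labelled with a made-up 'best' pair index."""
    data = []
    for _ in range(n):
        hidden = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
        target = max(range(DIM), key=hidden.__getitem__)
        data.append((hidden, target))
    return data


def reward(params, data):
    """Fraction of correct picks: a non-differentiable score, so no gradients."""
    hits = sum(choose_pair(hidden, params) == target for hidden, target in data)
    return hits / len(data)


def diagonal_es(data, rng, generations=20, pop=20, elite=5):
    """Sample head weights, keep the elite, refit per-coordinate mean and sigma."""
    n = len(PAIRS) * DIM
    mean, sigma = [0.0] * n, [1.0] * n
    best_params, best_reward = list(mean), reward(mean, data)
    for _ in range(generations):
        samples = [
            [rng.gauss(mean[i], sigma[i]) for i in range(n)] for _ in range(pop)
        ]
        scored = sorted(
            ((reward(s, data), s) for s in samples),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if scored[0][0] > best_reward:
            best_reward, best_params = scored[0]
        top = [s for _, s in scored[:elite]]
        for i in range(n):
            column = [s[i] for s in top]
            mean[i] = sum(column) / elite
            variance = sum((x - mean[i]) ** 2 for x in column) / elite
            sigma[i] = max(variance ** 0.5, 0.05)  # floor keeps some exploration
    return best_params, best_reward


rng = random.Random(0)
data = make_toy_data(rng)
params, best = diagonal_es(data, rng)
print("best toy reward:", best)
print("pick for first example:", PAIRS[choose_pair(data[0][0], params)])
```

The point of the sketch is the shape of the problem: the head is small enough to search directly, and the reward only needs to be evaluated, not differentiated, which is what makes an evolution strategy a natural fit.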
The result is not just better routing.
It is learned division of labor.
The paper reports that TRINITY outperforms individual models and existing coordination methods across coding, math, reasoning, and domain knowledge tasks. In its full-power setting, it reaches 86.2% on LiveCodeBench and transfers to held-out benchmarks including AIME, BigCodeBench, MT-Bench, and GPQA-D.
The most important idea here is bigger than the benchmark.
The future of AI may not be a single supermodel.
It may be an organization of models.
A small conductor.
A team of specialists.
A protocol for planning, execution, and verification.
An intelligence layer that learns how to allocate cognition.
This feels like a real shift:
from bigger models
to better systems
from raw capability
to coordinated capability
from “which model is best?”
to “what structure makes many models better together?”
Full credit to the authors:
Jinglue Xu, Qi Sun, Peter Schwendeman, Stefan Nielsen, Edoardo Cetin, Yujin Tang.
Paper: TRINITY: An Evolved LLM Coordinator
arxiv.org/abs/2512.04695
The abstract is worth reading closely.
The future of AI may not be monolithic.
It may be coordinated.
#ArtificialIntelligence #LLM #MultiAgentSystems #MachineLearning #EvolutionaryAlgorithms