🚨
@QuasarModels just released Quasar-Preview on $TAO's SN24. Not a fine-tune, not a wrapper. A new architecture, and the first public proof it works at real scale!
Everyone watches the benchmarks. The smart money watches the architecture.
What this actually means for anyone outside the research world:
Most AI models run on a standard Transformer, the same foundation under GPT, Claude, Gemini.
Powerful.
But it has a fatal limit: double the context, quadruple the attention compute. That quadratic wall is why long-context AI is still a bottleneck everywhere.
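To put rough numbers on that wall, here is a minimal sketch. It assumes attention cost grows with the square of the context length and ignores everything else in the model; the function name is made up for illustration.

```python
def relative_attention_cost(context_tokens: int, baseline_tokens: int) -> float:
    """Attention cost relative to a baseline, assuming cost scales as n^2."""
    return (context_tokens / baseline_tokens) ** 2


# Doubling the context quadruples the cost under this assumption.
print(relative_attention_cost(256_000, 128_000))  # 4.0

# A 5M-token window versus a 128K window, same assumption.
print(round(relative_attention_cost(5_000_000, 128_000)))  # 1526
```

So on plain quadratic attention, a 5M window would cost roughly 1,500x what a 128K window does.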
Quasar breaks this.
• 18B total parameters, 2B active per pass: the intelligence of a large model, the efficiency of a small one. Open the right shelf without loading the whole library.
• Experimental 5M-token context. For comparison, most frontier models cap at 128K-200K. Five million tokens is every document you've ever worked with, held in a single inference pass. Loop Transformer hybrid attention layers make it tractable where standard math gives up. Wow!
• At 0.1% of its full training budget, already matching Bittensor's previous 72B dense model on MMLU, and beating it on ARC Challenge and OpenBookQA.
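The ratios behind those bullets, as a quick sketch. It uses only the figures quoted in this post and has not been checked against any official Quasar spec; the variable names are just for illustration.

```python
# Figures as quoted in the post (billions of parameters, tokens).
TOTAL_PARAMS_B = 18
ACTIVE_PARAMS_B = 2
DENSE_BASELINE_B = 72
QUASAR_CONTEXT = 5_000_000
FRONTIER_CONTEXT_MAX = 200_000

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
dense_to_active = DENSE_BASELINE_B / ACTIVE_PARAMS_B
context_ratio = QUASAR_CONTEXT / FRONTIER_CONTEXT_MAX

print(f"Active share of the model per pass: {active_fraction:.1%}")  # 11.1%
print(f"72B dense vs 2B active: {dense_to_active:.0f}x")             # 36x
print(f"5M vs 200K context: {context_ratio:.0f}x")                   # 25x
```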
Let that land, folks. That's:
2B active parameters. Competitive with a 72B model. At 0.1% of training.
@TroyQuasar confirmed it himself: the model has seen only 0.1% of its intended token budget.
Today's benchmarks are the floor, not the ceiling. This is gonna fly.
@const_reborn didn't call it a language model. He called it a 5M context length agentic model.
That is the point: NOT a chatbot. An agent foundation designed to hold entire projects in memory, reason across hours of context, and never lose the thread. That's exactly what enterprise AI actually needs.
MIT license. Open weights. Trained on Bittensor's decentralized network: no central cluster, no gatekeeper, miners competing to build state of the art.
They'll count the benchmarks later. Right now, watch what's being built at 0.1%.
$TAO
DYOR.