🚨 Bittensor
$TAO Subnet 6
@numinous_ai just published something that should make the entire AI industry pay attention.
Their top miner is beating Google's Gemini on a live, transparent forecasting benchmark.
Not in a lab. Not in a press release. In real time, on a public leaderboard, scored by one of the most demanding metrics in probabilistic forecasting.
Look at what the data says and see for yourself.
Benchmark: Brier Score
If you're unfamiliar with Brier scoring, here's what you need to know.
It's not "did you get it right?" It's "how confident were you, and were you right?"
A model that says 90% on a coin flip gets penalized on average, even if it happens to win this time. A model that says 51% on an uncertain outcome and wins barely moves the score.
It rewards calibration. It rewards precision. It punishes overconfidence and random guessing alike.
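This is the metric intelligence agencies and quantitative hedge funds use to evaluate forecasters. It's the gold standard. It's a proper scoring rule: your best expected score comes from reporting your honest probability, so you cannot game it at scale over hundreds of events.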
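To make the scoring described above concrete, here is a minimal Python sketch. It assumes the standard binary form of the Brier score (mean squared error between the forecast probability and the 0/1 outcome); whether the Numinous leaderboard uses exactly this form is an assumption, and the function names are illustrative only.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is perfect, 0.25 is what always saying 50% earns.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally sized, non-empty forecasts and outcomes")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


def expected_brier(forecast, true_prob):
    """Average per-event score if you say `forecast` and the event
    really happens with probability `true_prob`."""
    return true_prob * (forecast - 1) ** 2 + (1 - true_prob) * forecast ** 2


# Saying 90% on a true coin flip: great when it lands, terrible on average.
print(round(brier_score([0.9], [1]), 4))    # 0.01 on the lucky win
print(round(expected_brier(0.9, 0.5), 4))   # 0.41 on average
print(round(expected_brier(0.5, 0.5), 4))   # 0.25 for the honest 50%

# Saying 51% and winning barely beats the 0.25 coin-flip reference.
print(round(brier_score([0.51], [1]), 4))   # 0.2401
```

The takeaway: one lucky win looks fine, but over many events the overconfident forecaster averages far worse than the honest one.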
The Numbers: Read These Carefully
Top Miner (UID 128): 0.1772 Brier Score. 71.8% accuracy. 600 events.
Gemini baseline, same questions, same window: higher Brier (worse).
For context: the difference between good and great in forecasting benchmarks is often measured in the fourth decimal place. Getting to 0.177 over 600 events is not luck. That's a system that has learned to be right and to know when it's right.
71.8% accuracy with a Brier that low means the model isn't hedging to 50/50. It's making directional calls with conviction and getting them right at a rate that moves the score meaningfully.
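A rough way to sanity-check those two numbers together, sketched in Python. The 0.25 reference is an assumption: it's what an always-50% forecaster scores on binary events under the squared-error form of Brier, not a baseline Numinous has published. The helper names are hypothetical.

```python
def brier_skill(score, reference=0.25):
    """Fractional improvement over a reference Brier score (higher is better)."""
    return 1 - score / reference


def constant_confidence_brier(accuracy, confidence):
    """Brier score of a forecaster who puts the same `confidence` on its
    chosen side for every event and is right `accuracy` of the time."""
    return accuracy * (1 - confidence) ** 2 + (1 - accuracy) * confidence ** 2


# Improvement of the reported 0.1772 over the always-50% reference.
print(round(brier_skill(0.1772), 4))                       # 0.2912

# A forecaster that is right 71.8% of the time but says "71.8%" on every call.
print(round(constant_confidence_brier(0.718, 0.718), 4))   # 0.2025
```

Under these assumptions, a flat 71.8% confidence on every call would land around 0.2025. Getting down to 0.1772 at the same hit rate is consistent with a system that varies its confidence event by event and is more often right when it's more confident.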
221 agents are competing on this network simultaneously. Open source. Transparent. Scored the same way for every miner. No cherry-picking. No selective disclosure.
The best one is beating Gemini.
Again: This Is the Bittensor Thesis Made Real, and No One Talks About It
Here is the argument that skeptics have made about decentralized AI for years:
"A network of anonymous miners competing on a blockchain cannot produce AI outputs that match what Google, OpenAI, or Anthropic can build with billions of dollars and the world's best engineers."
The Numinous leaderboard is a direct empirical refutation of that argument in one specific domain, on one rigorous metric, over a meaningful sample size.
Not "we think we can compete." We have the chart. We have the numbers. Look at it.
The Bittensor thesis isn't that decentralized AI beats centralized AI at everything simultaneously. That's not how specialization works. The thesis is that competitive markets produce superior specialized outputs in specific domains when the evaluation is transparent and the stakes are real.
Forecasting is one domain. The top miner won.
The competition is self-reinforcing: 221 miners see the leaderboard, learn from what's working, iterate, improve. Every epoch the baseline rises. Open source selection pressure means the collective intelligence of the entire network compounds permanently. Gemini's forecasting doesn't get better because someone improved a miner's CUDA kernel last night. Bittensor's does.
What the Architecture Looks Like
This isn't just "ask an LLM a question and score it." Numinous's architecture is worth understanding.
Miners aren’t submitting predictions. They’re submitting code: forecasting agents.
Validators run that code in sandboxes with curated data tools, and the agent searches, updates, and outputs a probability. Blind scoring across 600 events.
That’s automated superforecasting code: the best analyst methodology, running 24/7 at machine speed.
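The flow described above (an agent is handed a question and data tools, returns a probability, and is scored once events resolve) can be sketched roughly in Python. Everything here is hypothetical: `Event`, `run_agent`, `score_agent`, the toy search tool, and the 0.5 fallback on failure are illustrative assumptions, not Numinous's actual validator code, and no real sandboxing is shown.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    event_id: str
    question: str


SearchTool = Callable[[str], List[str]]
Agent = Callable[[Event, SearchTool], float]


def run_agent(agent: Agent, event: Event, search: SearchTool) -> float:
    """Run one agent on one event and return a probability in [0, 1].

    Assumption: a crash or invalid output falls back to 0.5.
    """
    try:
        p = float(agent(event, search))
    except Exception:
        return 0.5
    if math.isnan(p):
        return 0.5
    return min(1.0, max(0.0, p))


def score_agent(agent: Agent, events: List[Event], search: SearchTool,
                outcomes: Dict[str, int]) -> float:
    """Collect forecasts first, then score them against resolved outcomes."""
    forecasts = {e.event_id: run_agent(agent, e, search) for e in events}
    errors = [(forecasts[e.event_id] - outcomes[e.event_id]) ** 2 for e in events]
    return sum(errors) / len(errors)


def toy_search(query: str) -> List[str]:
    """Stand-in for a curated data tool; returns canned snippets."""
    return ["analyst note: likely yes", "poll: yes leads"]


def toy_agent(event: Event, search: SearchTool) -> float:
    """Start at 50% and nudge the estimate for each snippet found."""
    p = 0.5
    for snippet in search(event.question):
        p += 0.1 if "yes" in snippet.lower() else -0.1
    return p


events = [Event("a", "Will X happen?"), Event("b", "Will Y happen?")]
print(round(score_agent(toy_agent, events, toy_search, {"a": 1, "b": 0}), 4))
```

The point of the sketch is the separation: the agent never sees the outcomes, and every agent is scored by the same function on the same events.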
Stack:
Data: Desearch (SN22)
Compute: Chutes (SN64)
And it’s monetizable right now: Eversight sells the probabilities via API to traders/institutions.
Top miner’s Brier curve sat below Gemini’s for a full week. Public proof that a decentralized market of agents can beat a top centralized model in a measurable domain.
Ask Eversight a question. Compare it to Polymarket.
$TAO | @numinous_ai | leaderboard.numinouslabs.io