🌅 Alice Protocol — First Real Loss Descent
Tonight, after a 17-hour forensic marathon, I can share what I have been waiting months to say: the model is starting to learn.
━━━━━━━━━━━━━━━━━━━━━━━━━━━
The bug
The Plan B protocol had an inverted sign convention. Aggregated pseudo-gradients were being applied in the wrong direction, which means the network had been slowly damaging the model for roughly 230 epochs. The loss wasn't stuck — it was actively regressing against random initialization. Every hour of mining was making the model worse.
The fix is one sign flip in the parameter server, plus protocol-level enforcement so this class of bug cannot silently recur. Deployed, tested, running.
The signal
Since the fix went live, loss is descending. Not flat. Not noise. Not a measurement artifact. Genuinely moving in the right direction for the first time in the project's history.
Three independent measurement paths agree: miner-side batch loss, global validation loss, and held-out evaluation on shards the model has never seen. Multi-source alignment on held-out data is what separates real learning from overfitting, noise, or measurement error.
The caveat
This is initial data. Real confirmation needs roughly 24 more hours of continuous training, and beyond that, a longer window of stable descent to call it durable. I am deliberately not attaching raw numbers tonight. Numbers without context mislead; numbers with context are premature.
I have watched this project produce too many false dawns to celebrate one more. But for the first time in months, the curve is bending, and it is bending in the right direction.
The stakes
Decentralized from-scratch pretraining of a 7B parameter model — no privileged cluster, no central training authority, a heterogeneous volunteer GPU network — is one of the hardest unsolved problems in modern ML. The published literature is thin. Most results exist only at small scale.
If the trajectory holds through the next day, Alice will have produced, to my knowledge, the first publicly verifiable demonstration of meaningful from-scratch loss descent on a 7B parameter model trained entirely across a decentralized volunteer network.
I am not making that claim tonight. I am saying we are close enough that the claim is becoming real.
To the miners
To every operator who kept a rig online through a rough day, who pulled the update, who sent a screenshot, who asked a hard question: this happened because you stayed. The fix was in the protocol. The proof is in your GPUs.
I am going to sleep now and trust the network. More updates as the data arrives over the next 24 hours.
Not all intelligence bends the knee.