What if an LLM could EDIT its own tokens in real-time, not just generate them? 🤯

Introducing LLaDA2.1, a diffusion model that breaks from autoregressive dominance. It drafts fast, then fixes its own mistakes on the fly with Token-to-Token (T2T) editing. The result? 892 tokens/sec on a 100B model. 🔥

⚡ 892 TPS on HumanEval (coding)
⚡ 801 TPS on BigCodeBench
🧠 Real-time self-correction via T2T editing
✅ @lmsysorg SGLang Day 0 support: production-ready now

A "non-consensus" architecture now challenging the mainstream. Open-sourced TODAY. 👇

#LLaDA #TokenEditing #OpenSource #LLM #dLLM