🚨 @ZAI_ORG JUST DROPPED GLM-5.2, AND IT IS PUNCHING RIGHT AT THE LEVEL OF CLAUDE OPUS 4.8 🤯
The kicker?
It’s a 753B-parameter model with a true 1M-token context, released fully open-source under an MIT license.
What makes this release technically interesting:
→ IndexShare Attention: They reuse a single indexer across every 4 sparse layers, cutting per-token FLOPs by 2.9× at a 1M context.
→ Better Speculative Decoding: An improved MTP layer increases acceptance length by up to 20%.
→ Adjustable Compute: Flexible thinking-effort levels let you explicitly trade off between performance and latency.
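To make the IndexShare idea concrete, here is a toy sketch of the bookkeeping only: which sparse layers would run their own indexer and which would reuse an earlier result when one indexer is shared per group of 4. This is not GLM-5.2's code. The function names and the layer counts are hypothetical, and the sketch does not reproduce the 2.9× FLOPs figure, which depends on details the post doesn't give.

```python
from typing import List


def plan_indexer_sharing(num_sparse_layers: int, share_every: int = 4) -> List[int]:
    """For each sparse layer, return the index of the layer whose indexer it reuses.

    Assumption (hypothetical): the first layer of every group of
    `share_every` layers runs the indexer, and the rest of the group
    reuses its token selection.
    """
    return [(layer // share_every) * share_every for layer in range(num_sparse_layers)]


def count_indexer_runs(plan: List[int]) -> int:
    """Number of distinct indexer passes the plan needs."""
    return len(set(plan))


plan = plan_indexer_sharing(num_sparse_layers=8)
print(plan)                      # [0, 0, 0, 0, 4, 4, 4, 4]
print(count_indexer_runs(plan))  # 2 indexer passes instead of 8
```

With sharing, the indexer runs once per group rather than once per layer, which is where the savings would come from in a design like this.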
It’s also Day-0 ready.
You don’t have to wait for the ecosystem to catch up: it’s already supported in transformers, vLLM, and SGLang.
Repo, weights and paper in 🧵 ↓
Reporter: ‘Are you the one who makes decisions about Israel, or is it Trump and America?’
Netanyahu: ‘I set certain parameters for our activity and we are doing it, but we cannot completely go against what the U.S. says.’
Why do Apple Intelligence models only support the iPhone 17 Pro?
I was curious how Apple managed to fit a 20B-parameter model into the iPhone 17 Pro, a phone with only 12 GB of RAM.
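For a rough sense of the squeeze, here is the back-of-the-envelope arithmetic for the weights alone. The bit widths are illustrative assumptions, not Apple's published numbers, and the check ignores everything else that needs RAM (the OS, other apps, activations, caches).

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 20e9   # 20B parameters, from the post
RAM_GB = 12     # iPhone 17 Pro RAM, from the post

# Illustrative precisions only; the post does not say which one Apple uses.
for bits in (16, 8, 4, 2):
    size = weight_footprint_gb(PARAMS, bits)
    verdict = "fits" if size < RAM_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:5.1f} GB ({verdict} in {RAM_GB} GB)")
```

At 16 bits the weights alone come to 40 GB, and even at 4 bits they take 10 of the 12 GB, so loading the whole model at once is not a realistic option.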
Turns out they built a custom architecture for Apple Silicon that lets just part of the model be loaded into memory. It's a twist on MoE, where the full model never has to be in use at once!
According to Apple, a smaller model selects a fixed set of experts during initial processing. Experts that aren't needed aren't pulled into memory. "The 'routed experts' are swapped in when needed."
That's insane: the full model never has to be in use, and when other parts of it are needed, they get swapped in. Imagine a car that can swap its engine while driving, depending on whether it needs to be performant or efficient.
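The swapping idea can be sketched as a small cache that keeps only a few experts in memory and loads the rest on demand. This is a generic least-recently-used cache, not Apple's implementation: `ExpertCache`, `load_expert`, the capacity, and the expert IDs are all hypothetical names and values chosen for illustration.

```python
from collections import OrderedDict
from typing import Callable, List


class ExpertCache:
    """Keeps at most `capacity` experts resident; loads others on demand."""

    def __init__(self, load_expert: Callable[[int], object], capacity: int) -> None:
        self._load_expert = load_expert  # hypothetical: reads weights from storage
        self._capacity = capacity        # assumed to be at least 1
        self._resident: "OrderedDict[int, object]" = OrderedDict()
        self.loads = 0                   # number of times storage was hit

    def get(self, expert_id: int) -> object:
        if expert_id in self._resident:
            self._resident.move_to_end(expert_id)  # mark as most recently used
            return self._resident[expert_id]
        if len(self._resident) >= self._capacity:
            self._resident.popitem(last=False)     # evict least recently used
        self.loads += 1
        expert = self._load_expert(expert_id)
        self._resident[expert_id] = expert
        return expert

    def resident_ids(self) -> List[int]:
        return list(self._resident)


# Stand-in loader: a real one would read expert weights from flash storage.
cache = ExpertCache(load_expert=lambda i: f"weights-for-expert-{i}", capacity=2)
for expert_id in (3, 7, 3, 5):  # expert IDs a router might pick (made up)
    cache.get(expert_id)

print(cache.resident_ids(), cache.loads)  # [3, 5] 3
```

Only two experts are ever resident at once, and the repeated request for expert 3 is served from memory without touching storage.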
And since it's a multimodal model, that's what runs the new Siri voice and advanced dictation. That's why you can't get the new Siri voice on older iPhones: they don't have enough memory or bandwidth.
I thought that was pretty neat, and I wanted to share. You can read more at machinelearning.apple.com/re…
In the latest weekly bonzo bytes newsletter from @bonzo_finance, one of the governance proposals is the DOVU Risk Parameters Update.
It proposes safely expanding liquidity capacity for $DOVU while preserving robust liquidation margins, adjusting supply caps and borrow limits, and adjusting interest rates for the $DOVU asset.
It’s live right now and is expected to be implemented this week.
Head over to gov.bonzo.finance to register, read all about it, and share your feedback.