ボブにゃんVRC retweeted
🚨 @ZAI_ORG JUST DROPPED GLM-5.2, AND IT IS PUNCHING RIGHT AT THE LEVEL OF CLAUDE OPUS 4.8 🤯

The kicker? It’s a 753B-parameter model with a true 1M-token context, released fully open-source under an MIT license.

What makes this release technically interesting:
→ IndexShare Attention: They reuse a single indexer across every 4 sparse layers, cutting per-token FLOPs by 2.9× at a 1M context.
→ Better Speculative Decoding: An improved MTP layer increases acceptance length by up to 20%.
→ Adjustable Compute: Flexible thinking-effort levels let you explicitly trade off between performance and latency.

It’s also Day-0 ready. You don’t have to wait for the ecosystem to catch up: it’s already supported in transformers, vLLM, and SGLang.

Repo, weights and paper in 🧵 ↓
Ille retweeted
Reporter: ‘Are you the one who makes decisions about Israel, or is it Trump and America?’
Netanyahu: ‘I set certain parameters for our activity and we are doing it, but we cannot completely go against what the U.S. says.’
Aibaogun Izeokhai O. retweeted
Why do Apple Intelligence models only support the iPhone 17 Pro?

I was curious how Apple managed to fit a 20B-parameter model into the iPhone 17 Pro, a phone with only 12 gigs of RAM.

Turns out they built a custom architecture just for Apple Silicon that allows only parts of the model to be loaded into memory. It's a twist on MoE, where the full model never needs to be in use at once!

According to Apple, a smaller model selects a fixed set of experts during initial processing. Experts that aren't needed aren't pulled into memory. "The 'routed experts' are swapped in when needed."

That's insane: they made it so that they don't even need the full model in use, and when they need other parts of it, they swap those parts in. Imagine a car that can swap out its engine while driving, depending on whether it needs to be performant or efficient.

And since it's a multimodal model, that's what runs the new Siri voice and advanced dictation. That's why you can't get the new Siri voice on older iPhones: they don't have enough memory or bandwidth.

I thought that was pretty neat, and I wanted to share. You can read more at machinelearning.apple.com/re…
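
To make the "swap experts in when needed" idea from the post above concrete, here is a rough sketch of on-demand expert loading with a small resident cache. It is not Apple's implementation: the class `ExpertCache`, the helper `fake_load`, the least-recently-used eviction policy, and the toy "experts" (simple multipliers) are all assumptions made up for illustration.

```python
from collections import OrderedDict


class ExpertCache:
    """Keeps at most `capacity` experts resident and loads others on demand.

    Hypothetical illustration only; eviction here is least-recently-used,
    which is an assumption and not something the post describes.
    """

    def __init__(self, load_fn, capacity):
        self.load_fn = load_fn          # reads one expert from slow storage
        self.capacity = capacity        # how many experts fit in memory
        self.resident = OrderedDict()   # expert_id -> expert, oldest first
        self.loads = 0                  # number of slow loads performed

    def get(self, expert_id):
        if expert_id in self.resident:
            # Already in memory: mark as most recently used.
            self.resident.move_to_end(expert_id)
            return self.resident[expert_id]
        if len(self.resident) >= self.capacity:
            # Memory is full: drop the least recently used expert.
            self.resident.popitem(last=False)
        self.resident[expert_id] = self.load_fn(expert_id)
        self.loads += 1
        return self.resident[expert_id]


def fake_load(expert_id):
    # Stand-in for reading expert weights from flash storage.
    return lambda x: x * (expert_id + 1)


def run_token(cache, x, selected):
    # Average the outputs of only the experts selected for this input.
    return sum(cache.get(e)(x) for e in selected) / len(selected)


cache = ExpertCache(fake_load, capacity=2)
first = run_token(cache, 1.0, [0, 3])    # loads experts 0 and 3
second = run_token(cache, 1.0, [3, 5])   # reuses 3, evicts 0, loads 5
```

With a capacity of two, the second call reuses expert 3 and only pays for one extra load, which is the memory-saving effect the post is describing.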
hbar1000 ༼ つ ◕_◕ ༽つ retweeted
In the latest weekly bonzo bytes newsletter from @bonzo_finance, one of the governance proposals is the DOVU Risk Parameters Update. It suggests safely expanding liquidity capacity for $DOVU while preserving robust liquidation margins, adjusting supply caps and borrow limits, and adjusting interest rates for the $DOVU asset. It’s live right now and is expected to be implemented this week. Head over to gov.bonzo.finance to register, read all about it, and share your feedback.
Huijin retweeted
Matrix Upgraded. Swarm Expanded 🦉

Owl AI Agent OS evolved. We’ve unlocked refined models and broader execution parameters:
⬩ More Agents: Expanded options
⬩ New Battlefields: Fresh deployment vectors
⬩ Amplified Yield: Higher velocity returns optimized for volatility

Deploy: dapp.owlowl.ai
Replying to @ThatG4m
This is one of the best paramet I've ever seen