making AI that runs entirely on your Mac, no cloud, no API bills | obsessed with good design | sophomore @IITMadras | views my own (I'm unemployed)

Joined December 2025
WHAT THE HELL is happening in AI? A 3B parameter model just put up coding benchmark scores in the same league as Claude Opus 4.5. 3 BILLION. The weights are on Hugging Face, anyone can test it. I genuinely don't know if this is a breakthrough or if the benchmarks are broken.
And it might not be benchmaxxed... (though im testing the model on some benchmarks which the original paper didn't run, just to confirm) x.com/Gurprets225/status/206…

Replying to @orcus108
It's not gaming the benchmarks. If you read the paper, it focuses on math, which is a narrow domain and easy to build datasets on. This is why it sucks on general domain knowledge, cuz it was trained for math only. It also has a 96% acceptance rate on LeetCode problems released after its training data cutoff, which means it's doing well on problems it's never seen, and that points away from gaming the benchmarks. TL;DR: Not benchmaxxed, just trained on a narrow topic. It does very well on that topic and very poorly on anything else.
can someone explain why the cost graph is inverted?? makes no sense to me
Fable on DeepSWE Disagree with this one tho. GPT 5.5 xhigh is great but def not as good as Fable was.
"musk isn't stopping at 1T models" he's actually gonna make a Le Chaton Fat isn't he
SpaceX acquired Cursor. I expect SpaceX AI to be between Google and OpenAI by end of year. Composer 2.0 was a very strong model for only 1T params, but Elon isn't stopping at 1T models.
some things im curious about / wanna do
1. build open code from scratch
2. understand how chip design works
3. build an LLM inference engine
4. start using Hermes agent (or any good agent)
5. understand edge/local AI
6. getting a nice community on X
can someone at sarvam pls put the "powered by" on the second line so its one sentence per line @thedesignobsess @SarvamAI
orcus108 retweeted
OH MY GOD its happening
@MistralAI has officially confirmed the upcoming release of Le Chaton Fat
- 30T MoE with 256 experts
- 1M context window
- multimodal and multilingual
- outperforms every other model (including Fable 5) on every benchmark
Jun 14
Le Chaton Fat
India's first rocket launched from a church. Its components were transported on bicycles and bullock carts. We remember how ISRO ended up but we forget how it started. Maybe the question isn't how India catches up in AI. Maybe it's what race India should run first.
LMAO is he playing along or did he actually fall for it😭
They can’t keep getting away with this.
curious to see how it works with the smaller models. Could a fusion of, say, Qwen3-27B reach Sonnet/Opus/5.5 level?
Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
orcus108 retweeted
most people building on frontier AI have been operating on an assumption that broke last night. access to the most capable models isn't a utility you subscribe to - it's a privilege that can be revoked for hundreds of millions of users, developers, and startups with essentially no process. not just for people outside the US, but for everyone, including American nationals themselves.
the scarier part isn't even that it happened. it's that there's no real mechanism preventing it from happening again, to any model, at any time, for any reason that gets dressed up as national security.
the "just switch to open-source Chinese models" response misses something very important. right now, Chinese labs release open weights partly because it's a competitive weapon - it compresses American labs' margins and guarantees China access to capable models regardless of export controls. but that calculus changes the moment China reaches frontier parity. why keep giving away your best models when the US is hoarding its own? the latest big Qwen releases are already closed. the window where open-source AI functions as a global equalizer is closing from both ends at once, and faster than most people think.
if you can't run the software on your own hardware, assume it can be taken away at any moment.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
how to become “cracked” at something
A LEGEND FORGED IN SILVER RESUMES IN RED❤️❤️
Jun 14
FIRST WIN IN RED ❤️ #F1 #BarcelonaGP
absolutely must must read