Artificial Intelligence

Joined March 2021
GLM 5.2 seems to be a significant improvement over GLM 5.1 and has a Max Thinking option
New GLM-5.2 rumors: Beta testing appears to have started for some Max plan users. GLM-5.2 seems to have a 1M token context window, no multimodality, and two thinking intensity settings
Claude Fable 5 (xHigh) scores 70% on DeepSWE, matching GPT-5.5 (xHigh), according to Theo’s recent video
CursorBench scores for Claude Fable 5: Fable 5 Low has the same score as GPT-5.5 Extra High
Claude Fable will cost $10/$50 per million input/output tokens. Mythos Preview was priced at $25/$125 per million input/output tokens.
Scoop: A neutered version of Mythos called Claude Fable is coming today. It's expensive—2x the price of Opus—but perhaps not as pricey as people might have thought from the initial Mythos pricing (5x Opus). More on that and Apple WWDC in AI Agenda: theinformation.com/newslette…
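To make the price gap concrete, here is a minimal sketch of the per-request arithmetic. The two model prices come from the posts above; the Opus figure is only inferred from the "2x the price of Opus" and "5x Opus" ratios, and the dictionary keys, the function name, and the token counts in the usage note are hypothetical.

```python
# USD per million tokens as (input, output).
PRICES = {
    "claude-fable": (10.0, 50.0),     # from the post above
    "mythos-preview": (25.0, 125.0),  # from the post above
    "opus-implied": (5.0, 25.0),      # assumption: inferred from the 2x / 5x ratios
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For an illustrative request with 100,000 input tokens and 20,000 output tokens, this gives $2.00 at the Claude Fable price and $5.00 at the Mythos Preview price.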
GLM 5.2 seems to be coming soon. Another user on Reddit also noticed that trying to call GLM 5.2 returns a “no access” error, while trying to call GLM 6 returns a “model does not exist” error. I haven't verified it myself yet.
Well, well, well. What do we have here? Wild GLM-5.2 appeared in Coding Plan. It's inaccessible, yes, but it's coming very-very soon.
A user on Hacker News also reported a few hours ago that Fable 5 would be released tomorrow. Sources are now reporting the same.
Jun 9
New Claude model checkpoints (possibly Mythos GA):
- Claude Fable 5
- Claude Fruitcake EAP
The new checkpoints were detected for testing over the weekend.

Sources: Anthropic is planning to release a public version of Mythos tomorrow
- Will have substantial guardrails and not be as cyber permissive as what Project Glasswing partners can access
- Will be dramatically better at long-horizon, multi-turn tasks
sources.news/p/inside-apples…
Kindle (GPT-5.6) has been removed from the Arena. A new model, Levi, appeared shortly after Kindle was removed. The model’s front-end output looks similar to that of OpenAI models with the Design skill. Levi might also be GPT-5.6. Here is a comparison with GPT-5.5. Prompt: “Create a website about the upcoming World Cup”
🚨 A new anonymous model under the name "Kindle" has been added to the Design Arena, very likely to be the same "kindle-alpha" GPT-5.6 Release Candidate checkpoint previously revealed. As foretold! It's coming.
"Claude-Mythos-5" did just briefly show up in the API. It’s coming soon. I wonder if they’re going with the pricing from the Glasswing blog post: $25/$125 per million input/output tokens. That would make it 5 times as expensive as Opus 4.8. @White1637402 was the first one to report it.
Opus era is over. 'claude-mythos-5' just appeared in Anthropic's internal infra.
AiBattle retweeted
Seeing as Claude Mythos is releasing soon, I have two VERY astonishing outputs to share from it. 👀 ZERO-SHOT and LOW effort as well! These are the best outputs I've seen for this prompt ever since the October 2025 Gemini A/B models.
How much better are the internal, unreleased models at frontier labs like Google, OpenAI, and Anthropic? We got a glimpse exactly one year ago today, when Google accidentally leaked the “Kingfall” model. “Kingfall” was likely an unreleased Gemini 2.5 Ultra-sized model. It was available in AI Studio for only a few minutes but remained accessible through the API for several days. At the time, “Kingfall” appeared to be significantly better than Gemini 2.5 Pro at both code generation and creative writing. In a recent interview, Sundar Pichai mentioned that Google could have made a better, Ultra-sized Gemini Omni model, but would have had trouble serving it. The infrastructure required to serve Ultra-sized models at scale is likely why Google never publicly released models like “Kingfall”.
First sighting of “Kingfall” under the Confidential tab in AI Studio a year ago today x.com/AiBattle_/status/19302…

4 Jun 2025
A new mystery model selector called 'Confidential' and a model named 'Kingfall' have appeared in AI Studio!
I'm a bit worried about the upcoming Qwen 3.7 open-source models. A Qwen team member recently deleted a comment where he said they would likely release another 27B model. The Summary section of the 3.7 Plus blog post doesn’t mention any upcoming open-source models, whereas the 3.6 Plus blog explicitly said they would be open-sourcing smaller-scale models. We also didn’t get the other two Qwen 3.6 models from the poll, 9B and 122B. I still think we’ll probably get some open-source models from the 3.7 series, but it’s unclear which sizes they’ll be or when they’ll arrive.
MiniMax M3 scores 54.7 on the AA-Intelligence Index, beating Kimi K2.6’s score of 53.9. Once the weights are released, M3 will become the open-weights model with the highest score on the AA-Intelligence Index.
MiniMax M2.7 scored 0% on DeepSWE. I’m really curious to see how well M3 will do. The model rankings on the DeepSWE benchmark seem to reflect model performance better than other coding benchmarks do.
MiniMax is currently conducting internal checkpoint (CKPT) testing for M3, a multimodal, long-context model. The team is also resolving pipeline issues and upgrading its infrastructure. In the next few days, they plan to provide CKPT/API access for developers in the open-source community to evaluate the model.
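MiniMax M3 is about to be released, and we would like to invite some contributors from the Chinese open-source community to evaluate it. 阿岛 @SkylerMiao7 has set up a Feishu group where you can try it first! We also hope applicants have some experience contributing to open source (having contributed to an open-source project or having one of your own); just mention it in the verification message.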
Claude Opus 4.8 tops the Artificial Analysis Intelligence Index with a score of 61.4
Big improvement on CritPt compared to Opus 4.7: from 12% to 21%
Opus 4.8 shows major gains over Opus 4.7 on GraphWalks at 1M context length