Joined March 2013
254 Photos and videos
๐Ÿ”ฅGLM 5.2 vs Kimi K2.7. Which one is better? Will test it soon. What's your thoughts?
46
2
298
37,845
Bye bye Fable.
1
183
K2.6 was the best Chinese model until today. Now K2.7 is even better. Amazing work.
๐ŸŒ˜ Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! ๐Ÿ”ท Improved coding & agent performance over K2.6: 21.8% on Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite. ๐Ÿ”ท Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. ๐Ÿ”ท Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. โšก๏ธ 6x High-Speed Mode coming soon! ๐Ÿ”Œ Available today via Kimi API and Kimi Code. ๐Ÿ”— Kimi Code: kimi.com/code ๐Ÿ”— API: platform.moonshot.ai
2
8
917
So Fable 5 is more expensive than 5.5 xhigh, but at the same level.
Jun 9
Tldr Fable 5 low is solving in the least steps and is a great cost per task
4
11
1,562
๐Ÿ˜žYesterday I realized that 5.3 codex is removed. This is very sad. 5.3 was great for fixing bugs or small features. And it was very cheap. I hope we will get something in return.
2
166
Second iteration much better, but still issues with animations.
๐Ÿš€First try MiniMax 3. Not bad. Animations are a bit clunky, but overall it proposed me interesting design. Although I prefer what Kimi K2.6 proposed.
228
Current M3 pricing.
1
2
226
๐Ÿš€First try MiniMax 3. Not bad. Animations are a bit clunky, but overall it proposed me interesting design. Although I prefer what Kimi K2.6 proposed.
1
4
839
We have it! Looks fantastic.
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas - MiniMax Sparse Attention scales context to 1M - Natively Multimodal from Step Zero API: platform.minimax.io Token Plan: platform.minimax.io/subscribโ€ฆ ๐Ÿš€New! MiniMax Code: code.minimax.io Weights & Tech Report in ~10 Days
1
8
297
Playing with new TV(middle one). My wife loves the new bedroom design for the next two weeks - 10/10. My wife happiness - 10/10 My happiness - 10/10. Only one score is true. Guess which one. ๐Ÿ˜‚ BTW new soundbar is coming on Monday. ๐Ÿ™ˆ
1
4
611
Did Opus 4.8 beat GPT-5.5? Please tell me because I don't use anything from Anthropic.
116
365
105,648
I wonder what that could be! ๐Ÿค” Can't wait.
Something BIG is coming
257
๐Ÿ‡จ๐Ÿ‡ณXiaomi increased mimo usage. This is DeepSeek effect. Most companies reduce usage. Deepseek and xiaomi increase or reduce price. Win for us.
๐Ÿš€ Better inference efficiency, lower costs, broader access. MiMo-V2.5 Series API pricing is now permanently reduced โ€” by up to 99% compared to previous pricing. โœจ Unified pricing across all context lengths. MiMo Token Plans have also been upgraded: โ€ข 5โ€“8ร— more usable tokens at the same price โ€ข Simpler and more transparent billing rules ๐ŸŽ As a thank-you to current users, all current Token Plan credits will be fully reset. ๐ŸŽง MiMo-V2.5-TTS remains free for a limited time. โฐ Effective May 26 at 6:00 PM PDT. These improvements are powered by continued inference optimization and serving efficiency upgrades across the MiMo stack. ๐Ÿ› ๏ธ Weโ€™ll also publish a detailed technical blog on the inference optimizations later โ€” stay tuned.
1
51
2,370
Unbelievable. For this price it's amazing choice for coding and Hermes agent.
We are making our discount permanent! ๐ŸŽ‰ Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! ๐Ÿš€
1
16
955
MiniMax 3.0 this week please?
1
2
270
Such an amazing terminal score. ๐Ÿ˜ฎ
May 19
Replying to @Google
Gemini 3.5 Flash is built to help you execute complex, agentic workflows. 3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.
1
178
8gb vram for local models? Still usable. You don't have to start with rtx 6000 pro.
running Hermes locally with Qwen 3.6-35b-a3b is possible on a RTX 4060 Ti 8GB. my params are: ~~~ llama-server \ -m ~/llama-models/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf \ -ngl 999 -ncmoe 30 -fa on \ --cache-type-k q8_0 --cache-type-v q8_0 \ -c 32768 -n 8192 -np 1 -t 6 \ --reasoning off \ --no-cache-prompt --checkpoint-every-n-tokens -1 \ --jinja --metrics --host 0.0.0.0 --port 8080 ~~~ biggest flaws: - context: if you are coding, a few prompts will eat it all - speed: it took 17min to create a medium-difficult .py file but it works! I'm going to test /goal feature as well, to see how Qwen handle multiple compactions and see if it can finish a goal.
5
375
Great insight about mtp vs gguf.
2
3
295
So for less than 3090 price you can get 4x1080ti and have 44gb vram. Speed is quite good, tdp 1000w, but it can be reduced to 800w with no performance loss. 2x1080ti with 22gb vram is not bad too. Long live 1080ti. ๐Ÿฅณ
Really good speeds on 1080 ti. Wow.
1
1
404
Crazy times.
113