Open frontier intelligence, in your hands - the MiniMax-M3 PRISM Dynamic-Quant recipe is ready! 428B parameters compressed from the ~450GB MXFP8 release down to 119GB, using a per-tensor sensitivity ranking that protects attention and shared-expert paths (3.4–4.5 bpw) while squeezing the two largest routed-expert tensor blocks. Still too large for most local rigs, so next we prune the experts that can be removed without collapsing agentic/coding performance. Target: 60–80GB.
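For a rough sense of what those sizes mean, here is a back-of-the-envelope sketch of the average bits per weight (bpw). It assumes 1 GB = 1e9 bytes and that file size is dominated by weights. Both are assumptions, not something the post states. The helper names are made up for illustration.

```python
# Back-of-the-envelope check of the sizes quoted above.
# Assumptions (not from the post): 1 GB = 1e9 bytes, and file size is
# dominated by weights (metadata and scale tables are ignored).

PARAMS_B = 428.0  # parameters, in billions (from the post)


def bits_per_weight(size_gb: float, params_b: float = PARAMS_B) -> float:
    """Average bits per weight for a checkpoint of size_gb gigabytes.

    (size_gb * 1e9 bytes * 8 bits) / (params_b * 1e9 params); the 1e9s cancel.
    """
    return size_gb * 8.0 / params_b


def params_that_fit(size_gb: float, bpw: float) -> float:
    """Billions of parameters that fit in size_gb at an average of bpw bits."""
    return size_gb * 8.0 / bpw


mxfp8_bpw = bits_per_weight(450.0)
prism_bpw = bits_per_weight(119.0)

print(f"MXFP8 release: {mxfp8_bpw:.2f} bpw")
print(f"PRISM quant:   {prism_bpw:.2f} bpw")

# If pruning keeps the same average bpw (an assumption), how many
# parameters survive at the 60-80GB target?
for target_gb in (60.0, 80.0):
    kept = params_that_fit(target_gb, prism_bpw)
    print(f"{target_gb:.0f}GB target: ~{kept:.0f}B params kept ({kept / PARAMS_B:.0%})")
```

Under these assumptions the 119GB build averages about 2.2 bpw, which is below the 3.4–4.5 bpw given to the protected paths, so the routed experts must sit under that average. Hitting 60–80GB at the same average would mean keeping roughly half to two thirds of the parameters.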
We heard you wanted to use Codex rate limit resets on your own time.
Starting today, we’re rolling out the ability to save rate limit resets to use later.
We’re starting Go, Plus, Pro, and Business users with one free reset:
If you want to try out Fable but are scared of blowing through your token budget, do we have a skill for you!
It's called Thrifty. It makes things cheaper.
We built Thrifty for Claude Code around a better split:
strong model plans → cheap model executes → gate verifies.
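The split above can be sketched in a few lines. This is only an illustration of the plan → execute → verify shape, not Thrifty's actual implementation, which the post does not describe. Every name here (`run_split`, `StepResult`, the `demo_*` stubs, `max_retries`) is hypothetical, and the "models" are plain functions standing in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str       # the planned step
    output: str     # what the executor produced
    attempts: int   # how many tries the gate needed to accept it


def run_split(
    task: str,
    plan: Callable[[str], List[str]],      # stand-in for the strong model
    execute: Callable[[str], str],         # stand-in for the cheap model
    gate: Callable[[str, str], bool],      # verifier: accept or reject output
    max_retries: int = 2,
) -> List[StepResult]:
    """Plan once, execute each step cheaply, and retry until the gate accepts."""
    results: List[StepResult] = []
    for step in plan(task):
        for attempt in range(1, max_retries + 2):
            output = execute(step)
            if gate(step, output):
                results.append(StepResult(step, output, attempt))
                break
        else:
            # No attempt passed the gate: stop rather than ship unverified work.
            raise RuntimeError(f"gate rejected step: {step}")
    return results


# Hypothetical stubs so the sketch runs without any model API.
def demo_plan(task: str) -> List[str]:
    return [f"{task}: write failing test", f"{task}: implement fix"]


def demo_execute(step: str) -> str:
    return f"done: {step}"


def demo_gate(step: str, output: str) -> bool:
    return output.endswith(step)


for result in run_split("fix login bug", demo_plan, demo_execute, demo_gate):
    print(result.attempts, result.output)
```

The point of the shape is that the expensive model is called once per task, while retries only cost cheap-model tokens. A real gate would more likely run tests or a linter than compare strings.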
So far Fable is making VERY good decisions in an incredibly large monorepo, drastically reducing token spend and increasing accuracy over 4.8, which was already very good.
Working on my security app, authz, secrets stores and honey tokens without issues. Deploying to production, diagnosing production issues, smoke testing and evidence gathering. Across the board just better (so far).
grok-build is incredibly fast but not very useful, unfortunately. it refuses to do even the simplest self-validation, it is the laziest model I have worked with in a long time, and you'll spend more time correcting and handholding it than you will waiting for tokens on a local 122b
I can tell it wants to be a good model but it's too willing to please and too unwilling to keep going and hold itself to a standard of behavior that leads to good results