I tested $META's Muse Spark over the last few hours and came away net positive. 3 main takeaways:
1) Quality: It's a very good model. Not quite frontier, but good. It performed comparably to Opus 4.6 across web data search, PDF parsing, and general knowledge/conversation/writing. It's worse at coding: both models solved an easy coding task, but Muse Spark failed the hard one while Opus one-shotted it. The image-gen is also worse than ChatGPT's. But all in, it's a legitimate and usable general model. There's lots of room to develop the UI further (e.g. it should show a map when recommending local restaurants), but the underlying model itself is impressive.
2) Speed: Notably, Muse Spark answered almost instantly while Opus 4.6 felt borderline unusable at times. I'm a huge Anthropic fan, but latency has become a major issue: simple answers take too long and multi-step agent flows break more often. Meta seems to have more available compute, which is a real factor going forward.
3) Scaling: Meta hasn't published a full model card so we're working with limited disclosure. But the graphs below might be the most important part of the release. Their rearchitected pretraining stack shows a near-linear relationship between RL compute and accuracy. If that holds, Meta has a clear path to training much larger, more intelligent models. That's arguably more consequential than Muse Spark itself.
All in, it's positive. Muse Spark is a good, usable model, it's being served smoothly, and Meta looks to be on an encouraging trajectory.