MinMax3 just dropped!
SWE Bench Pro: 59.0%
Terminal Bench 2.1: 66.0%
SVG Bench: 63.7%
BrowseComp: 85.5%
GDPval Rubrics: 74.7%
MCP Atlas: 74.2%
OSWorld Verified: 70.0%
I am in disbelief that they’re open-sourcing a model that beats both Opus and GPT 5.5 on BrowseComp and SVG Bench. It also beats GPT 5.5 on SWE Bench Pro, KernelBench Hard, and BankerToolBench, and beats Opus on OSWorld Verified.