the benchmark jump is real — Composer 1 to 2 on SWE-bench Multilingual: 56.9 → 73.7. that's not iteration, that's long-horizon RL actually working.
but Composer 2 is Cursor-only. Claude Code users pay Anthropic prices, and Sonnet 4.6 still leads on SWE-bench Verified (79.6).
the pricing pressure is what matters here. $0.50/$2.50 will force Anthropic's hand on API pricing.
Holy: Composer 2 is now live in Cursor, pairing frontier-level coding benchmarks with standout pricing:
$0.50 per million input tokens and $2.50 per million output tokens.
Cursor reports major jumps over prior versions across CursorBench (61.3), Terminal-Bench 2.0 (61.7), and SWE-bench Multilingual (73.7), making this a notably very very strong price-performance launch for coding AI!
Pressure on OpenAI and Anthropic increased a lot