This is the benchmark to watch today once Claude Opus 4.7 launches.
BridgeBench Reasoning.
Claude Opus 4.6 sits at #4.
Behind Grok 4.20 Reasoning, GPT 5.4, and Grok 4.20 Non-Reasoning.
If Claude Opus 4.7 doesn't leapfrog all three, Anthropic has a problem.
Results coming the moment the model drops.