Whilst I don't doubt there are improvements/innovations on the inferencing side of things. Seeing this narrative from a few key players and the release of Deep Research from OpenAI feels an awful lot like rejustifying the capital expenditure on GPU after the Deepseek moment.
If you're not already paying attention to this shift, you should be: the balance of compute is moving from pre-training to inferencing. We're seeing massive gains here from scaling up test-time compute, with no ceiling in sight