Why don’t you compare against codex-5.3-xhigh?
The latest Deep Think moves beyond abstract theory to drive practical applications.
It’s state-of-the-art on ARC-AGI-2, a benchmark for frontier AI reasoning.
On Humanity’s Last Exam, it sets a new standard, tackling the hardest problems across mathematics, science, and engineering — making it a genuine collaborator for heavy-duty analysis.
It achieved an Elo of 3455 on Codeforces, demonstrating the ability to solve complex, real-world coding tasks, and earned gold-medal-level results on the written portion of the 2025 Physics and Chemistry Olympiads.