We evaluated Fable prior to its release, but spent the last two days double-checking the results because we couldn't believe how good they were.
A more thorough analysis will follow. The results (particularly the solution to the Frogsgame task) deserve it!
Claude Fable 5 ranks #1 on FrontierSWE. This is the biggest capability jump we have observed since releasing the benchmark.
On many tasks, Fable 5 works productively for close to 20 hours and fully saturates tasks that were effectively out of reach for earlier models.