What stands out in Charlie’s Codex/ty memory-reduction work is not that single dramatic patch and 25% gain, but the operating pattern used!
Set a measurable goal, let the agent keep searching, and accumulate small reviewable wins. Little by little gaining the big win.
The patches as far as I can see are tedious performance engineering to avoid retained queries when syntax is absent, or share duplicated parameter data, and fast-path fixed descriptor behavior, with removing redundant inference paths.
Each change is narrow, measured, and mergeable (Best use for /goal IMO)
This is what the real shift feels like.
Agents may not replace deep engineering judgment immediately, or ace benchmarks measuring merge-ability.
They loop in to keep finding 0.5%, 1%, and 2% improvements until the total becomes hard to ignore.
Since my last post, I reduced ty’s retained memory by another 15% with Codex. We're now at a ~25% memory reduction overall via /goal, largely in the background.
I love working with the GPT models so much.