OpenAI CEO Sam Altman on why AI models are learning to slow down and "think" before they answer:
Altman explains that one of the earliest surprises with GPT models was how a simple instruction changed everything:
"One of the things that got people really excited in the early days of the GPT models was you could get better performance by telling the model, let's think step by step and it would then just output text that was thinking step by step and get a better answer."
He says reasoning models take that idea much further. By breaking a question into pieces, the model can spend more time on each part.
To explain how this works, Altman compares it to his own thought process:
"When you ask me a question, if it's a really easy question, I might just fire back like almost on reflex with the answer. But if it's a harder question, I might think in my head and have like my internal monologue go and say, 'Well, I could do this or that or maybe, you know, this will be clearer. I'm not sure about that.' And I could like backtrack and retrace my steps."
Only after that internal deliberation does he deliver a clean answer:
"When I finish thinking and I've, you know, been thinking in English, I can then make some bullet points and then kind of like output an answer to you in English."
@sama shares an observation from using the app himself—asking a deep research question and watching it keep working even after he's locked his screen.
He recalls another company's approach to measuring this:
"I heard somebody, another company... I think it was Anthropic, said, 'Hey, this model actually spent like 15 minutes or 30 minutes or whatever length of time to think about a thing,' which is a good metric, but it needs to actually give you the right answer."
This led to a realisation that went against his every instinct:
"All of my instincts have been the instant response is the thing that matters and users hate to wait. And for a lot of stuff, that's true. But for hard problems with a really good answer, people are quite willing to wait."