Voice is the new computer interface. It makes technology disappear and creates magical experiences.
To all my coworkers who spent years listening to me talking to my computer. This is it. It’s your turn now.
Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents.
Voice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold.
Now available in the API alongside streaming models GPT-Realtime-Translate and GPT-Realtime-Whisper — a new set of audio capabilities for the next generation of voice interfaces.