Our latest audio model, Gemini 3.5 Live Translate, gives developers low-latency, real-time speech translation across 70 languages.
Because it processes speech as it streams in, the model lets developers build near-real-time audio experiences with:
— Multilingual input: Understands multiple languages in a single session without needing to adjust settings.
— Auto-detection: Identifies the spoken language and begins translation instantly.
— Native audio processing: Generates more natural-sounding speech that preserves speakers' intonation, pacing, and pitch.
— Noise robustness: Filters out ambient noise for clearer conversation in loud environments.
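
To make "processing speech as it streams in" concrete, here is a minimal sketch of the client-side half of that idea: cutting raw audio into short, fixed-duration chunks that can be sent one at a time instead of waiting for the full recording. Everything in it is an assumption for illustration, not part of the model's documented interface: the function name `iter_pcm_chunks`, the 16 kHz 16-bit mono PCM format, and the 100 ms chunk length are all hypothetical choices.

```python
from typing import Iterator


def iter_pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16_000,  # assumed input rate, not a documented requirement
    chunk_ms: int = 100,        # assumed chunk duration
    sample_width: int = 2,      # bytes per sample (16-bit mono PCM assumed)
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration chunks of raw mono PCM audio."""
    bytes_per_chunk = sample_rate * sample_width * chunk_ms // 1000
    # Keep every chunk aligned to a whole sample so no sample is split in two.
    bytes_per_chunk -= bytes_per_chunk % sample_width
    if bytes_per_chunk <= 0:
        raise ValueError("chunk_ms is too small for the given sample rate")
    for start in range(0, len(pcm), bytes_per_chunk):
        # The final chunk may be shorter than the others.
        yield pcm[start:start + bytes_per_chunk]


# One second of silence at the assumed format: 16,000 samples * 2 bytes each.
one_second_of_silence = bytes(16_000 * 2)
chunks = list(iter_pcm_chunks(one_second_of_silence))
```

In a real client, each chunk would be handed to whatever streaming connection the API provides as soon as it is captured. Shorter chunks reduce the delay before translation can begin, at the cost of more messages per second.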