Google DeepMind releases Gemini 3.5 Live Translate for near real-time speech-to-speech translation in over 70 languages
The new audio model integrates with Google AI Studio, Google Translate, and Google Meet, enabling fluid, low-latency translation while preserving speaker intonation and handling noisy environments.
- Gemini 3.5 Live Translate is a new audio model from Google DeepMind that delivers near real-time speech-to-speech translation in over 70 languages.
Google DeepMind announced the release of Gemini 3.5 Live Translate, a new audio model designed for near real-time speech-to-speech translation in over 70 languages. The model automatically detects languages and generates translated speech that preserves the speaker’s intonation, pacing, and pitch, aiming to reduce awkward pauses and maintain synchronization with the speaker.
Unlike turn-based translation systems that wait for the speaker to finish before responding, Gemini 3.5 Live Translate processes speech continuously, balancing the need for context with the demand for immediacy. The model is reported to stay just a few seconds behind the speaker throughout a session, delivering fluid audio output.
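To make the contrast concrete, here is a minimal sketch of the timing difference between the two approaches. It is not Google's implementation: the `Chunk` type, the `fake_translate` placeholder, the chunk boundaries, and the one-second processing time are all hypothetical, chosen only to show why continuous processing keeps the output closer to the speaker.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Chunk:
    """A slice of recognized speech (hypothetical type for illustration)."""
    end_time_s: float  # when the speaker finished saying this slice
    text: str


def fake_translate(text: str) -> str:
    """Placeholder for a real translation step; it only wraps the text."""
    return f"<{text}>"


def turn_based(chunks: Iterable[Chunk], processing_s: float) -> List[Tuple[float, str]]:
    """Wait for the whole utterance, then emit one translation."""
    chunks = list(chunks)
    if not chunks:
        return []
    ready_at = chunks[-1].end_time_s + processing_s
    return [(ready_at, fake_translate(" ".join(c.text for c in chunks)))]


def streaming(chunks: Iterable[Chunk], processing_s: float) -> Iterator[Tuple[float, str]]:
    """Emit a translation for each chunk shortly after it is spoken."""
    for c in chunks:
        yield (c.end_time_s + processing_s, fake_translate(c.text))


# Assumed example: a six-second utterance split into three two-second chunks.
utterance = [Chunk(2.0, "a"), Chunk(4.0, "b"), Chunk(6.0, "c")]
first_turn_based = turn_based(utterance, processing_s=1.0)[0][0]
first_streaming = next(streaming(utterance, processing_s=1.0))[0]
print(first_turn_based, first_streaming)
```

In this toy setup the turn-based listener hears nothing until the speaker has finished, while the streaming listener hears the first translated audio one chunk plus the processing time after speech begins. A real system also has to carry context across chunks, which is the trade-off between context and immediacy the article describes.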
The model is rolling out across multiple Google products: developers can access it in public preview via the Gemini Live API and Google AI Studio; enterprises can preview it this month in Google Meet; and it is available to all users via the Google Translate app on Android and iOS.
Gemini 3.5 Live Translate is designed to handle multilingual inputs without manual configuration and includes noise robustness features to function in loud or unpredictable environments. Google highlights potential use cases such as live interpretation for multilingual calls, meetings, lessons, and broadcasts.
Google also notes partnerships with companies like Agora, Fishjam, LiveKit, Pipecat, and Vision Agents, which are integrating the model to build voice translation applications. These partners manage real-time media streaming infrastructure, allowing developers to focus on user experience.
Early adopters, including Grab, CJ ENM, and LiveKit, have provided positive feedback on the model’s translation quality, accuracy, and low latency. Grab, which facilitates over 10 million voice calls per month, is testing the model to enable near real-time multilingual communication between drivers and travelers.
In Google Meet, the update expands speech translation from five languages to more than 70, and from English-only pairings to more than 2,000 language combinations. The interface has also been updated to give instant access to speech translation. A private preview for select Google Workspace customers starts this month, with a broader rollout planned for later this year.
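The "over 2,000 combinations" figure is consistent with simple pair counting over roughly 70 languages. The sketch below shows the arithmetic; how Google actually counts combinations is not stated in the source, so both the unordered and the directed counts are given, and the function names are illustrative only.

```python
from math import comb


def unordered_pairs(n_languages: int) -> int:
    """Number of language pairs when A<->B counts once."""
    return comb(n_languages, 2)


def directed_pairs(n_languages: int) -> int:
    """Number of language pairs when A->B and B->A count separately."""
    return n_languages * (n_languages - 1)


print(unordered_pairs(70))  # 2415
print(directed_pairs(70))   # 4830
```

Either way of counting puts 70 languages above the 2,000 mark, and the total grows quadratically as more languages are added.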