Google launches the Gemini 3.5 Live Translate model, which recognizes more than 70 languages for voice translation

Google launched Gemini 3.5 Live Translate on June 11, 2026, integrating real-time speech translation into Google Translate and Google Meet. The model supports over 70 languages, including Ukrainian, and preserves the original speaker’s intonation and pitch while translating continuously to reduce conversational delays to just a few seconds.

Voice Preservation and the Shift to Continuous Translation

The core technical shift in Gemini 3.5 Live Translate is the move away from “wait-and-translate” architecture. Traditional systems typically wait for a speaker to complete a full sentence before processing and outputting a translation, creating a stop-and-start rhythm that hinders natural conversation.

“Unlike systems that wait for a speaker to finish talking before responding, 3.5 Live Translate generates speech continuously, balancing between waiting for context to improve quality and instant translation for synchronization with the speaker. It provides a fluid sound without awkward pauses and remains only a few seconds behind the speaker throughout the session.”
Google, via Detector Media

Beyond timing, the model prioritizes the emotional and acoustic profile of the speaker. It captures and replicates the original speaker’s tempo, pitch, and intonation. This allows the translated audio to sound human rather than robotic, maintaining the nuance of the conversation.

This development places Google in direct competition with OpenAI, which launched ChatGPT Translate earlier this year. While OpenAI’s service supports over 50 languages, Gemini 3.5 expands that reach to more than 70.
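Google has not published how 3.5 Live Translate decides when to start speaking. As a rough illustration of the trade-off described in this section, between waiting for context and translating immediately, the sketch below uses wait-k, a simple policy from simultaneous translation research: the translator stays k source tokens behind the speaker and then alternates between listening and speaking. The function names and token counts are hypothetical and are not taken from Google's system.

```python
from typing import Iterator, Tuple


def wait_k_schedule(num_source: int, num_target: int, k: int) -> Iterator[Tuple[str, int]]:
    """Yield ("READ", i) and ("WRITE", j) actions for a wait-k policy.

    Illustrative only: this is a textbook policy, not Google's method.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    read = 0     # source tokens heard so far
    written = 0  # target tokens spoken so far
    while written < num_target:
        # Keep listening until we are k tokens ahead of our own output,
        # or until the speaker has finished.
        if read < min(written + k, num_source):
            yield ("READ", read)
            read += 1
        else:
            yield ("WRITE", written)
            written += 1


def action_string(num_source: int, num_target: int, k: int) -> str:
    """Compact view of the schedule: R = listen, W = speak."""
    return "".join(name[0] for name, _ in wait_k_schedule(num_source, num_target, k))


if __name__ == "__main__":
    print(action_string(4, 4, 4))  # wait for the whole sentence: RRRRWWWW
    print(action_string(4, 4, 2))  # stay two tokens behind:      RRWRWRWW
    print(action_string(4, 4, 1))  # translate almost instantly:  RWRWRWRW
```

A larger k gives the translator more context before each output token at the cost of a longer delay. A smaller k keeps the output closer to the speaker but forces earlier guesses.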

Expanding Google Meet to 2,000 Language Combinations

Google is integrating the model into Google Meet to remove the long-standing reliance on English as the primary bridge language. Previously, the service was limited to five languages. The update expands this to over 70 languages, including Korean, Swedish, Chinese, Spanish, and English.
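The figure of roughly 2,000 combinations in the heading is consistent with simple pair counting, assuming every supported language can be paired with every other one. The article does not say how Google counts, so the short calculation below is only a plausibility check, and the helper name is made up for illustration.

```python
from math import comb


def language_pairs(num_languages: int, directed: bool = False) -> int:
    """Count language pairs.

    Assumption: every language can be paired with every other one.
    Unordered pairs treat A<->B as one combination; directed pairs
    count A->B and B->A separately.
    """
    unordered = comb(num_languages, 2)
    return unordered * 2 if directed else unordered


if __name__ == "__main__":
    print(language_pairs(5))                  # 10 pairs with the earlier five languages
    print(language_pairs(70))                 # 2415 unordered pairs with 70 languages
    print(language_pairs(70, directed=True))  # 4830 if each direction counts separately
```

With exactly 70 languages the unordered count is 2,415, which fits "over 2,000".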

Android Listening Mode and Environmental Noise

For mobile users, Google is introducing a specific Listening Mode on Android devices. This feature allows users to hold their phone to their ear as they would during a standard call, with the translation playing through the earpiece speaker. This removes the need for headphones in private settings. The model is designed to function in noisy environments and does not require users to manually configure language settings, as the AI automatically recognizes the spoken language.

In addition to speech, Google has updated its text translation capabilities. By utilizing the Gemini model, the service now interprets slang, local idioms, and context more accurately, moving away from literal word-for-word translations to convey actual meaning.

SynthID Watermarking to Prevent AI Disinformation

To address the risk of AI-generated audio being used for disinformation, Google is implementing a safety layer called SynthID. All audio produced by the Gemini 3.5 Live Translate model will carry an embedded digital watermark. The watermark is imperceptible to human listeners but allows software to identify the audio as AI-generated. This gives the audio a verifiable origin trail, which is meant to limit misuse such as deepfake voice clones or manipulated recordings.

Developer Integration and the Grab Partnership

Google is extending the model’s utility beyond its own apps by offering it to developers through the Gemini Live API and Google AI Studio. This allows third-party companies to build real-time voice translation into their own software.

One early adopter is Grab, which is testing the model to facilitate communication between drivers and passengers. This is a high-volume application, as Grab processes approximately 10 million voice calls per month. The integration is currently being tested in markets including Vietnam, where users can access the “Live Translate” feature in the top left corner of the Google Translate app on iOS and Android.

The broader implications of this rollout suggest a move toward “invisible” translation. By reducing latency to a few seconds and preserving the speaker’s original voice, the technology attempts to remove the friction of the translation process entirely, making the AI a transparent layer in human communication.
