Fluid, natural voice translation with Gemini 3.5 Live Translate
Google released Gemini 3.5 Live Translate on June 9, 2026. It is an audio model that translates speech in one language into speech in another within a few seconds. It covers more than 70 languages, detects each one automatically, and keeps the original speaker's intonation, pacing and pitch so the output sounds human rather than robotic. The model ships at once across three products: Google AI Studio for developers, Google Meet for business customers, and the Google Translate app for everyday users.
Key Takeaways
- Gemini 3.5 Live Translate does speech-to-speech translation in near real time, staying only a few seconds behind the person speaking instead of waiting for full sentences to finish.
- The model handles more than 70 languages and detects them automatically, so two people speaking different languages can talk without any manual setup.
- It reaches developers through the Gemini Live API in Google AI Studio, business users in Google Meet, and consumers in the Google Translate app on Android and iOS.
- Inside Google Meet, supported language coverage grows from 5 to more than 70, opening over 2,000 language combinations in a single meeting.
- Translated audio carries a SynthID watermark so the output is identifiable as AI-generated later.
- The model preserves each speaker's voice qualities and works through background noise, keeping conversations usable in busy real-world settings.
Stats & Key Facts
- More than 70 languages supported and detected automatically by the model.
- Google Meet language coverage expands from 5 languages to over 70 languages.
- More than 2,000 language combinations now supported within a single Google Meet session.
- Translations stay only a few seconds behind the live speaker.
- Grab, which is testing the model, sees more than 10 million voice calls per month on its platform.

Streaming Speech-to-Speech Translation That Stays a Few Seconds Behind
The core change is how the model handles timing.
Older translation tools wait for a speaker to finish a full sentence, then translate, which forces people to talk turn by turn. Gemini 3.5 Live Translate works as a streaming model, so it starts producing translated speech while the person is still talking and stays only a few seconds behind through the whole session.
The result is a continuous flow that feels closer to a normal conversation than a walkie-talkie exchange. Two people speaking different languages can keep talking without long pauses between turns.
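The difference between the two timing models can be sketched with a toy example. This is not Google's implementation: the real model works on audio, while this sketch uses a list of words, a hypothetical `fake_translate` stand-in that only upper-cases text, and an assumed `window` of three words. It shows only the buffering behavior, where a turn-based tool emits nothing until the sentence ends and a streaming one emits output while the speaker is still talking.

```python
from typing import Iterable, Iterator, List

SENTENCE_END = (".", "?", "!")


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for a translator: just upper-cases the text."""
    return text.upper()


def turn_based(words: Iterable[str]) -> Iterator[str]:
    """Buffer words until a sentence ends, then emit the whole sentence."""
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        if word.endswith(SENTENCE_END):
            yield fake_translate(" ".join(buffer))
            buffer = []
    if buffer:  # flush a trailing, unfinished sentence
        yield fake_translate(" ".join(buffer))


def streaming(words: Iterable[str], window: int = 3) -> Iterator[str]:
    """Emit output every `window` words, or sooner if a sentence ends."""
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        if len(buffer) >= window or word.endswith(SENTENCE_END):
            yield fake_translate(" ".join(buffer))
            buffer = []
    if buffer:  # flush whatever is left when the speaker stops
        yield fake_translate(" ".join(buffer))


words = "where is the nearest train station please?".split()
turn_based_chunks = list(turn_based(words))
streaming_chunks = list(streaming(words))

print(turn_based_chunks)  # one chunk, emitted only after the last word
print(streaming_chunks)   # several chunks, the first after three words
```

In this sketch the listener hears the first streamed chunk after three words instead of seven. A production system would also have to handle word order differences between languages, which is one reason real streaming translation trails the speaker by a few seconds rather than a few words.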
More Than 70 Languages With Automatic Detection
Language coverage and setup are built to be effortless.
- The model recognizes more than 70 languages and identifies which one is being spoken on its own.
- No manual pairing is needed, so people can start a conversation without choosing a source and target language first.
- Automatic detection lets the model switch between speakers in different languages inside the same exchange.
Natural Voice That Keeps Intonation, Pacing and Pitch
Google focused on making the output sound like a person, not a machine.
The translated voice preserves the original speaker's intonation, pacing and pitch. That means a question still sounds like a question and emphasis carries through, rather than flattening into a monotone readout.
Keeping these voice qualities helps listeners follow tone and intent, which matters in business calls, customer support and travel where meaning depends on more than the words alone.
Three Products: AI Studio, Google Meet and the Translate App
The model lands in developer, business and consumer tools at the same time.
- Developers get it through the Gemini Live API in Google AI Studio in public preview, starting immediately.
- Google Workspace business customers get it inside Google Meet in a private preview this month, with a broader rollout later in 2026.
- Consumers get it in the Google Translate app on Android and iOS in a global rollout, with no sign-up required.
- On Android, a new listening mode plays the translation through the phone earpiece for one-on-one chats, so no earbuds are needed.
Google Meet Jumps From 5 Languages to More Than 2,000 Combinations
The meeting upgrade is the largest jump in scale.
Inside Google Meet, supported language coverage grows from 5 languages to more than 70. Because every supported language can pair with every other, a single meeting now supports more than 2,000 language combinations.
For companies running multilingual teams, that turns Meet into a tool where colleagues each speak their own language and hear others in theirs, all in one call.
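The combination figure can be checked with a short calculation. The article does not say how Google counts combinations, so this sketch assumes a "combination" is an unordered pair of distinct languages and uses 70 as a lower bound for "more than 70". The function names are illustrative only.

```python
from math import comb


def language_pairs(n_languages: int) -> int:
    """Unordered pairs of distinct languages (A<->B counted once)."""
    return comb(n_languages, 2)


def translation_directions(n_languages: int) -> int:
    """Ordered pairs (A->B and B->A counted separately)."""
    return n_languages * (n_languages - 1)


pairs_before = language_pairs(5)             # 10
pairs_after = language_pairs(70)             # 2415
directions_after = translation_directions(70)  # 4830

print(pairs_before, pairs_after, directions_after)
```

Under that assumption, 70 languages give 2,415 pairs, which matches the "more than 2,000" figure, compared with 10 pairs for the earlier 5 languages. Counting each translation direction separately would give 4,830.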
Built for Noisy Settings and Watermarked With SynthID
Two design choices target trust and real-world use.
- The model handles background noise and overlapping voices, so it stays usable in cafes, open offices and crowded streets.
- All audio the model produces carries a SynthID watermark, Google's method for marking AI-generated content so it is identifiable later.
- The watermark addresses growing concern about telling synthetic audio apart from real recordings.
Grab and Other Partners Testing the Model at Scale
Early adopters give the model a large live test base.
Ride-hailing company Grab is testing the model to help drivers and travelers talk across languages. Grab users place more than 10 million voice calls per month on its platform, a sizable real-world workload.
Google also pointed to feedback from media company CJ ENM and integration partners including LiveKit, Agora, Fishjam, Pipecat and Vision Agents, which have wired the Gemini Live API into their tools.
Frequently Asked Questions
What is Gemini 3.5 Live Translate?
It is a Google audio model that translates spoken language into another spoken language in near real time. It covers more than 70 languages, detects them automatically, and keeps the speaker's intonation, pacing and pitch.
How fast does it translate?
It streams the translation while the person is still speaking and stays only a few seconds behind. This avoids the turn-by-turn pauses of older tools that wait for full sentences.
Where can I use it?
Developers reach it through the Gemini Live API in Google AI Studio, business users get it in Google Meet, and consumers use it in the Google Translate app on Android and iOS in a global rollout.
How many languages does Google Meet support now?
Meet expands from 5 languages to more than 70, which opens over 2,000 language combinations in a single meeting.
Is the translated audio marked as AI-generated?
Yes. All audio the model produces carries a SynthID watermark, Google's tool for identifying AI-generated content later.
Gemini 3.5 Live Translate moves Google toward live conversation where language is no longer the barrier, handling many languages in one continuous flow across developer, business and consumer tools. Early tests with partners like Grab will show how well it holds up at scale.