DeepL Voice preferred by 96% of professional linguists, outpacing leading competitors in spoken translation speed and accuracy

New study reveals DeepL Voice delivers higher-quality translations and more stable live captions in meetings than Google Meet, Microsoft Teams and Zoom across 14 language combinations

DeepL, a global AI product and research company, has released results from an independent benchmark study evaluating the quality and stability of real-time AI translation and captions for meetings across widely used collaboration platforms: Google Meet, Microsoft Teams and Zoom.

As real-time multilingual meetings become routine for global teams, enterprises increasingly rely on AI voice technology to support collaboration, customer negotiations and cross-border decision-making.

Commissioned by DeepL and carried out independently by Slator – the leading source of research and market intelligence for translation, localization, interpreting, and language AI – the study finds that DeepL Voice achieves the highest scores for both translation quality and caption stability, outperforming built-in caption translation tools available in Google Meet, Microsoft Teams and Zoom.

In high-stakes settings – from sales discussions and supplier negotiations to internal strategy meetings – even minor translation errors or unstable captions can slow momentum, create confusion or undermine critical business decisions. As global business accelerates, AI-driven translation is no longer a convenience feature; it has become essential infrastructure. The strength of that infrastructure depends on two factors: the quality of AI translation and the stability of live captions.

Key findings include:

DeepL Voice leads translation quality in human evaluation, with quality scores of 96.4/100 (DeepL Voice for Zoom) and 96.3/100 (DeepL Voice for Teams), compared with 87–89 across the other evaluated platforms.
DeepL Voice significantly reduces high-severity errors, lowering the rate of critical or major translation errors by 76% on average versus other platforms.
DeepL Voice produces fully passing translated segments 79% of the time, compared with 42% across competing tools.
DeepL Voice improves caption stability, earning stability scores of 88.6/100 (DeepL Voice for Zoom) and 85.8/100 (DeepL Voice for Teams) in Slator’s automated frame-level analysis.
DeepL Voice reduces caption churn (on-screen rewrites/flicker) by 37.6% on average versus Microsoft Teams and 54.7% on average versus Zoom.
Across all blind evaluations, DeepL Voice is preferred by 96% of professional linguists benchmarking the solutions.

“Language AI is becoming the core infrastructure for how global businesses operate. In that context, accuracy and stability aren’t features, they’re requirements,” said Jarek Kutylowski, CEO of DeepL. “This independent benchmark shows DeepL Voice delivering on both and at a level that sets a new standard for real-time communication. When professional linguists overwhelmingly prefer one solution, it’s a clear signal of where the market is heading.”

“We looked beyond basic accuracy scores and focused on the ‘human’ side of AI captioning: readability, fluency, and stability. We didn’t just want to know if the words were right at the end; we wanted to see how they behaved while a person was trying to read them. That meant evaluating both linguistic quality and how captions behave on screen and that’s where we can clearly see a new benchmark being set by DeepL,” said Alex Edwards, Head of Consulting, Slator.

Why stability matters as much as accuracy
Slator’s report finds that even when translations are broadly accurate, frequent caption rewrites can interrupt comprehension and reduce usability in real-world meetings. To reflect how captions appear to end users, Slator measured stability using frame-by-frame analysis of rendered captions on screen, capturing flicker, oscillation and rewriting behavior over time.
