Deepgram Launches Flux Multilingual: The World’s First Multilingual Conversational Speech Recognition Model

One model, ten languages, and monolingual-grade accuracy for voice agents worldwide

Deepgram, the real-time AI infrastructure company underpinning the Voice AI economy, announced the general availability (GA) of Flux Multilingual, which expands its conversational speech recognition model beyond English to support 10 languages. The model can automatically detect, understand, and switch languages within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first real-time multilingual conversational speech recognition model, which delivers accurate turn-taking, interruption handling, and low latency for natural, human-like conversations at global scale.

Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition.

Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.

With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.

“Voice AI agents will soon become the default for how global enterprises interact with customers,” said Scott Stephenson, CEO and Co-Founder, Deepgram. “Today is a major step forward towards that future. Flux Multilingual gives developers a single perception model to build global voice agents, with the ability to switch language mid-call. Now, enterprises can deliver the same seamless experience to any customer, in any market. Deepgram is the leader in real-time AI infrastructure, and Flux Multilingual is the latest in our suite of capabilities that enables developers to deliver real-time products across the globe.”

“Customers told us that Flux transformed what’s possible for real-time voice AI agents in English,” said Omar Paul, Vice President of Products, Twilio. “It stood to reason that Deepgram would solve this globally too. Our customers’ teams no longer need to sacrifice accuracy with legacy multilingual systems, nor stitch multiple models with complex routing themselves. With Flux Multilingual, teams take the exact conversational experience they built for English and extend it across languages with a single system.”

Flux Multilingual Capabilities

Supported Languages
English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch

Ultra-low latency conversational speech recognition, now global
Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.
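For contrast, the "simple silence detection" baseline mentioned above can be sketched in a few lines. This is an illustrative sketch only, not Deepgram's method or API: the function name `silence_end_of_turn`, the per-frame energy input, and all thresholds are hypothetical choices made for the example.

```python
from typing import Iterable, Optional


def silence_end_of_turn(
    frame_energies: Iterable[float],
    frame_ms: int = 20,               # assumed frame length (hypothetical)
    energy_threshold: float = 0.01,   # assumed speech/silence cutoff (hypothetical)
    min_silence_ms: int = 400,        # silence required before ending the turn
) -> Optional[int]:
    """Return the index of the frame where the turn is declared over, or None.

    A turn ends once speech has been heard and is followed by at least
    `min_silence_ms` of consecutive frames below `energy_threshold`.
    """
    silent_ms = 0
    heard_speech = False
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            # Speech frame: remember that the turn started, reset the timer.
            heard_speech = True
            silent_ms = 0
        elif heard_speech:
            # Silent frame after speech: accumulate trailing silence.
            silent_ms += frame_ms
            if silent_ms >= min_silence_ms:
                return i
    return None


# Speech, a short mid-sentence pause (100 ms), more speech, then a long pause.
frames = [0.5] * 10 + [0.0] * 5 + [0.5] * 10 + [0.0] * 25

patient = silence_end_of_turn(frames, min_silence_ms=400)
eager = silence_end_of_turn(frames, min_silence_ms=100)
```

With the 400 ms threshold the function waits through the short pause and ends the turn only in the final silence. With a 100 ms threshold it fires during the mid-sentence pause and cuts the speaker off. A fixed timer forces that trade-off between responsiveness and premature cutoffs, which is the limitation a model-based end-of-turn decision is meant to avoid.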

Monolingual-grade accuracy with real-time language control
Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.

Build and scale global voice agents with one model
Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration: one model, ten languages, and one API, with no additional infrastructure or model orchestration required.

Key Features

  • Native turn detection and interruption handling for natural dialogue flow
  • Low-latency streaming transcription for real-time responsiveness
  • Automatic language detection and language hint support for accuracy control
  • Mid-session configurability for dynamic language adaptation
  • Native code-switching within a single conversation
  • Fully compatible with existing Flux API integrations

Flux Multilingual is now generally available (GA). As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including the Flux Multilingual and Nova-3 models.

Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.

Business Wire (https://www.businesswire.com/)
