Baidu has announced the launch of its next-generation neural text-to-speech technology. The new technology, Deep Voice 2, extends the company's production-quality neural speech synthesis system. In February 2017, the search engine company announced the Deep Voice 1 pipeline as one of the most advanced audio synthesis platforms, delivering higher performance and significantly improved speech quality compared to existing technologies.
With the Deep Voice 2 iteration of its TTS pipeline, Baidu intends to deliver speech analytics to marketers as a way to identify customers across multiple touch points.
Deep Voice 2 Learns from Recurrent Speech Analytics
In February 2017, Baidu’s Silicon Valley AI Lab released the Deep Voice 1 system. The neural network can reproduce human voices in real time, synthesizing audio with a neural text-to-speech (TTS) pipeline. Baidu's TTS (artificial speech synthesis) learns from recurrent speech analytics and input augmentation. This makes neural speech synthesis pipelines scalable to multi-speaker TTS models without requiring a separate speech training module.
Deep Voice 2, as an iteration of Baidu's TTS system, learns from hundreds of unique human voices and, according to Baidu, imitates them closely. Unlike traditional systems, the new TTS system can learn hundreds of voices from less than 30 minutes of audio per speaker. The synthesized speech reproduces each target speaker's voice accurately. Deep Voice 2 learns these qualities from scratch, without any guidance about what makes voices distinguishable.
MarTech’s Future Powered By AI-Based Synthetic Speech
For marketers, Deep Voice 2 could reduce customer service and feedback management to a single click. By training deep neural networks on interactive media assets, data analysts can build customized automated systems that learn from customer data and Voice of Customer (VoC).
The increasing involvement of digital assistants in marketing campaigns and customer engagement clearly indicates where data analytics is headed. Marketers can now use smarter AI-based technologies to improve their stacks with enhanced synthesized communication. Building machine learning capabilities into every touch point of the customer journey helps brands create more human-like interactions, avoid awkward conversations, and customize more easily. Deep Voice 2, running in AI-based intelligent assistants, is expected to serve as full-blown marketing ammunition for speech analytics.