The Importance of AI Voice Ethics

There are few scenarios where people prefer to listen to an AI voice, as opposed to a real one.

This might be a counterintuitive statement from an AI voice startup founder. However, research suggests that humans process natural speech in ways that will be difficult for technology to ever replicate. We’ve evolved to recognize reassuring nuances in speech that let us know we’re safe and can trust the speaker.

Even so, over the past decade, progress in synthetic speech has led many companies to adopt AI voices, a trend accelerated by a digital audio renaissance.

In late 2020, Media24, South Africa’s largest news platform, launched the world’s first custom AI audio narrator across their digital subscription package, which boosted audience engagement across all age groups.

The paradigm is shifting for synthetic speech

In many cases, AI voice technology is economical and practical.

Converting written content via text-to-speech APIs costs far less than hiring a voice actor. Personalized content, real-time updates, and large-volume publishing can now be turned into audio, which would be implausible with a voice actor.

The advantages of synthetic speech are beginning to outweigh the drawbacks as voice technology improves.

AI voices have already come far in recent years, breaking from the confines of Interactive Voice Response (IVR) systems, personal assistants, and screen reader technology. Synthetic speech is now being used in audiobooks, gaming, news, and film making.

Innovation requires collaboration

As AI voice adoption progresses, two problems arise.

Firstly, voice actors face job displacement. This is inevitable to some extent, as with all automation.

The second problem is more opaque: a stalemate between voice actors and AI companies will stifle innovation, diversity, and fairness of voice technology.

Recent high-profile news stories have left the voice acting industry spooked by the threat of synthetic speech. Some companies have taken advantage of voice actors by having them sign contracts that grant perpetual rights over their voice IP.

In September 2021, TikTok settled a suit with voice actor Bev Standing after the company launched a text-to-speech feature based on her voice. Several years earlier, Standing had entered into a contract with a different company, and her voice IP was then transferred to TikTok without her consent.

Voice actors will no longer enter into contracts without fair control over their voice IP — a crucial input for the advancement of AI voice modelling. Unless voice IP licensing standards are adopted, we’ll end up living in a world of robotic, generic, and soulless AI voices.

Imagine a world where every newsreader, narrator, and voice-over sounded like your Alexa. No thanks! The more voice actors are marginalized by AI, the further we travel down this path towards a voice tech dystopia.

A new voice economy

However, we predict a different future — and we’re working towards it.

Automation of some audio is inevitable (and our business relies on it), but by no means will voice actors be replaced by AI voices. If businesses can afford to use a voice actor and the recording process doesn’t impede production, using a natural voice is still the obvious choice.

There will be automation of many use cases: news articles, corporate communications, educational content — those scenarios where volume and efficiency are priorities. And as AI voice modelling continues to improve and technology progresses, more opportunities will emerge. The metaverse, for example, is a format (if you could call it that) where AI voices will flourish.

However, the future success of voice depends on building fair business models that compensate voice actors for their IP. In order to build a vibrant, dynamic, and diverse future for voice, voice actors will need to be included and compensated. Technology cannot replicate human naturalness without humans.

Progress is being made. The Open Voice Network (OVON), a subsidiary of the Linux Foundation, is pushing forward open standards in voice licensing to help protect voice actors and build trust in AI voices. Last month, BeyondWords and the OVON launched a first-of-its-kind voice cloning contract, a template to help voice actors protect their IP when working in synthetic speech and secure a fair rate.

This is the beginning of a new era for voice technology, an era where voice actors are fairly compensated for their expertise, creating a new voice economy, and, with hope, some pretty awe-inspiring technology.

James MacLeod

James MacLeod is Co-Founder and COO at BeyondWords
