The report pointed out that command language, speech recognition, and accuracy issues are barriers to voice adoption
The UXS Technology Planning Report: Artificial Intelligence (AI) found that voice, as the primary human-machine interface (HMI) for AI, provides more natural engagement through discussion and consulting. The recent Strategy Analytics study investigates the needs, behaviors, and expectations of future AI consumers.
Media measurement and analytics company comScore has stated that by 2020, around 50% of all online searches will be voice searches. Voice-enabled AI solutions are changing the landscape of smart homes and IoT technologies. The use of voice search and voice messages on Facebook Messenger, mobile messaging apps, and Google mobile apps has been on the rise.
The vast number of websites generates huge amounts of data. This data helps brands create more personalized content for consumers. Through AI, machine learning, and voice search advancements, brands can provide quick answers to customer demands. Google has created algorithms that identify the intent behind a customer’s search queries and answer them correctly.
An increase in the number of voice search users suggests that voice will soon be dominant as the primary HMI. A MindMeld survey shows that 60% of smartphone users who used voice search had begun using it within the past year, with 41% of respondents having begun only within the preceding six months.
Brands need to be proactive in integrating voice recognition into their products and services to stay ahead of the competition. Google is focusing heavily on voice search and natural language processing, which is understandable considering that in 2015 alone, voice search rose from a “statistical zero” to 10% of all searches globally.
The use of AI in smart home and IoT applications is increasing rapidly. Consumers associate AI with digital assistants, entertainment recommendations, and self-driving cars. A Rockfish survey revealed that consumers prefer AI with personality. However, many gaps need to be addressed before AI can become the operating system of a consumer’s life.
Barriers to Voice Adoption
The study identified command language, speech recognition, and accuracy issues as the barriers to voice adoption. Users need voice recognition software that works accurately in any language and performs well even in noisy conditions. Fluent.ai, an AI-powered voice-enabled user interface, aims to do more than that. It is designed to understand users who speak complex languages, with their associated accents, as well as those who have speech impairments. It learns from its users and improves over time, with accuracy approaching 100%. Fluent.ai offers original equipment manufacturers (OEMs) and user interface designers a speech interface solution for their products and services.
Although voice commands provide the most natural HMI, Christopher Dodge, Associate Director at Strategy Analytics and the report’s author, pointed out the need for manual options in some instances to override the intelligent aspects of a device. “Anxiety around technology breaking down and rendering the user helpless also exists, especially if it is relied on for everyday tasks. This becomes a concern in the event of a power failure, device battery failure, or inability to communicate effectively at any given time,” Dodge explained.
Chris Schreiner, Director of Syndicated Research, User Experience Improvement Program (UXIP), has a solution. He said, “OEMs need to provide a quick method for users to intervene and regain control. The AI solution needs to be intelligent enough to identify that a change has been requested and adjust accordingly. Knowing how to react when the user changes their mind or alters their path is crucial when AI gets something wrong.”
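The kind of manual override Dodge and Schreiner describe can be pictured with a small sketch. It is only an illustration under stated assumptions: the class and method names are hypothetical and do not come from the report or from any vendor's product. It shows a setting that an automated system keeps suggesting, while a manual choice always takes precedence until the user releases it.

```python
class OverridableSetting:
    """A value suggested by an automated system, with a manual override that always wins.

    Hypothetical illustration only; not taken from the report or any vendor SDK.
    """

    def __init__(self, suggested):
        self.suggested = suggested
        self.manual = None  # None means the user has not intervened

    def suggest(self, value):
        # The automated side may keep updating its suggestion, but this
        # does not change the active value while an override is in place.
        self.suggested = value

    def override(self, value):
        # The user intervenes and regains control.
        self.manual = value

    def release(self):
        # The user hands control back to the automated system.
        self.manual = None

    @property
    def active(self):
        return self.suggested if self.manual is None else self.manual
```

In this sketch, a thermostat-style device would read `active` to decide what to apply, so a user's choice is never silently replaced by a later automated suggestion.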
Fragmentation Causes Inconsistent User Experiences
The study also found that fragmentation causes inconsistent user experiences because not all features and functions are the same across all AI devices and platforms. Moreover, not all AI devices and platforms have the ability to ‘speak’ to each other.
Dodge noted that fragmentation results in users having to devote extensive time to managing different devices and profiles. Consumers get frustrated trying to remember different command languages. This is mainly due to the differing capabilities of phone assistants and at-home assistants.
Command language and the scope of supported features differ significantly across AI platforms. If one of the most compelling features of AI is to minimize user input, learning the differing abilities and command languages of many AI devices is time-consuming and mostly inefficient. For instance, the command to activate the voice-enabled solutions powered by Amazon Alexa and Google Assistant is not the same.
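To make the wake-phrase difference concrete, here is a minimal sketch of how one command ends up phrased differently for each assistant. The wake phrases listed are the commonly known defaults ("Alexa" for Amazon Alexa, "OK Google" and "Hey Google" for Google Assistant); the table and function names are hypothetical and are not part of any vendor's API.

```python
# Commonly known default wake phrases. The table and function names below are
# hypothetical illustrations, not part of any vendor SDK.
WAKE_PHRASES = {
    "alexa": "Amazon Alexa",
    "ok google": "Google Assistant",
    "hey google": "Google Assistant",
}


def split_wake_phrase(utterance):
    """Return (assistant, command) if the utterance starts with a known
    wake phrase, otherwise (None, utterance)."""
    text = utterance.strip()
    lowered = text.lower()
    # Try longer phrases first so a short phrase never shadows a longer one.
    for phrase in sorted(WAKE_PHRASES, key=len, reverse=True):
        starts_with_phrase = (
            lowered == phrase
            or lowered.startswith(phrase + " ")
            or lowered.startswith(phrase + ",")
        )
        if starts_with_phrase:
            # Drop the wake phrase and any separator that follows it.
            command = text[len(phrase):].lstrip(" ,")
            return WAKE_PHRASES[phrase], command
    return None, text
```

The same request, "turn on the lights", has to be prefixed differently depending on which assistant is listening, which is the kind of command-language overhead the study describes.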
Opening Access to Speech Recognition API to Increase Utility
Schreiner revealed that consumers want the same level of AI to be embedded in all their devices. That way, AI could function much faster, and usage would not be affected if the internet connection goes down or slows.
Microsoft Cortana, Google Assistant, and Amazon Alexa have opened their speech recognition APIs to third parties, who can use the same software to power their own devices. The speech recognition software gets better as more third-party products (and their data sets) connect to it.
Amazon has launched a platform called Amazon Lex for making its voice-control technology available to all, giving developers access to the same tools that power its digital assistant Alexa. Amazon Lex bundles in deep learning-powered automatic speech recognition and natural-language understanding.
Before this, the Alexa 7-Mic Far-Field Development Kit was made available to third parties for developing improved Alexa devices. The kit gives Amazon’s partners access to the same circular 7-microphone array that is used on the Amazon Echo. Amazon Alexa is a voice service that provides capabilities, or skills, that enable customers to interact with devices in a more intuitive way.
Enhanced AI Experience with Advanced NLP and User Profiling
The study suggests that more intelligent voice implementations, supported by advanced user profiling, continued learning and natural language processing, will enhance the AI experience and make it seem more ‘intelligent’ and life-like. However, many users are unsure or simply unaware of the specific commands they need to activate voice-enabled AI solutions.
The Google Cloud Speech API enables developers to convert audio to text by applying powerful neural network models through an easy-to-use API. The Microsoft Speech Platform supports 26 languages, whereas Google speech recognition supports 80 languages.
Devices Within an Ecosystem Should Talk to Each Other
For example, if a consumer does not have Amazon Alexa built into all of their devices, they run the risk of one device not communicating with another, or of having multiple voice assistants talking to them instead of the smart devices talking to each other.
The study concludes that all systems within a voice assistant ecosystem should be able to communicate with each other, to ensure seamless integration, regardless of brand.