Voice Bots or Conversational AI (including chatbots and messengers) is a transformational technology incorporating artificial intelligence.
For those of us who are frustrated by voice bots that don’t recognize our accent or pronunciation and keep repeating themselves, and for those who are not fans of holding for customer service for 30 minutes, voice bots are an exciting as well as a challenging technology.
Most of us already interact with voice bots in many situations – Siri, Alexa, and Google Assistant – with varying degrees of success.
For business and technology leaders in the enterprise, voice bot technology is a strategic imperative.
Table of Contents:
- Voice Bots (or Voice Technology) Definition
- Voice and Speech Recognition Technologies: A Trip Down the Memory Lane
- Why Enterprises Need Voice Bot Technology?
- Top Enterprise Use Cases for Voice Bot Technology
- How do Voice Bots Technologies work?
- Top Features and Capabilities of Voice Bots Technology
- Adopting and Expanding Voice Technology in the Enterprise: Best Practices
- Pitfalls and Shortcomings of Voice Technology
- Evaluating Potential Voice Bot Platform Vendors
Definition of Voice Bots (or Voice Technology):
Let’s begin by asking the fundamental question, “What is a Voice Bot or AI-powered Voice Technology?”
Voice Bots (or Voicebots) are an emerging technology that leverages natural language processing (NLP) and natural language understanding (NLU) to understand a request from a customer (or an employee or other stakeholder), frame it in proper context, and provide an engaging response, with the ability to resolve the issue or escalate to a human agent for further support.
Voice bot technology includes ingesting, decomposing, analyzing, synthesizing, and interpreting all the relevant data and information, both structured and unstructured, and formulating meaningful responses to questions that may be phrased differently.
Or, to put it in a single sentence: “A Voice Bot is an AI system that harnesses underlying data and information to engage in meaningful and intelligible two-way conversation by voice.”
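The flow in this definition (understand the request, respond, and either resolve or escalate to a human agent) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the intent names, keyword lists, canned replies, and the `detect_intent` and `handle_utterance` functions are made up for illustration, and simple keyword overlap stands in for a real NLP/NLU model. It also works on text only, so it assumes speech has already been transcribed.

```python
import re

# Hypothetical intents and trigger keywords; a real voice bot would use a
# trained NLU model instead of keyword overlap.
INTENT_KEYWORDS = {
    "pay_bill": {"pay", "bill", "payment"},
    "book_appointment": {"appointment", "schedule", "book"},
}

# Hypothetical canned replies, one per intent.
RESPONSES = {
    "pay_bill": "Sure, I can help you pay your bill.",
    "book_appointment": "Of course, let's set up an appointment.",
}

ESCALATION_MESSAGE = "Let me connect you to a human agent."


def detect_intent(utterance: str):
    """Return the intent whose keywords overlap most with the utterance, or None."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


def handle_utterance(utterance: str) -> dict:
    """Answer a recognized request, or escalate when no intent matches."""
    intent = detect_intent(utterance)
    if intent is None:
        return {"intent": None, "reply": ESCALATION_MESSAGE, "escalated": True}
    return {"intent": intent, "reply": RESPONSES[intent], "escalated": False}


if __name__ == "__main__":
    print(handle_utterance("I would like to pay my bill"))
    print(handle_utterance("Can I make a payment?"))
    print(handle_utterance("My router keeps blinking red"))
```

The first two requests are phrased differently but map to the same hypothetical intent, while the third matches nothing and is handed to a human agent. A production system would replace the keyword table with a statistical model and a confidence threshold.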
Voice and Speech Recognition Technologies: A Trip Down the Memory Lane
While voice technology has taken a giant leap in recent years, the technical underpinnings and concepts are not new. They go all the way back to Shoebox, a voice recognition system built by IBM in the early 1960s. That rudimentary and yet exciting technology for the times recognized all of sixteen words.
In the 1970s, DARPA funded five years of speech recognition research, and the program led to the creation of Harpy by Carnegie Mellon, a machine capable of understanding 1,011 words.
Dragon Dictate by Dragon Systems (now Nuance) in the 1990s was a full-featured speech recognition product.
And Siri, Cortana, Alexa, and Google Assistant are the latest and most popular incarnations of voice and speech recognition technologies.
The following Infographic details the history of Voice and Speech recognition and today’s voice bots.
Why Enterprises Need Voice Bot Technology?
The need for voice bots and conversational AI is partly due to the following macro and micro trends.
- The ubiquitous 24/7 world of digital interactions and transactions
- The rising customer expectations about what constitutes good service
- The culture of impatience and the need for instantaneous answers
- The risk of negative word of mouth on social media at the drop of a hat
- The expensive proposition of building and resourcing call centers
- The need to keep up with the competition, which makes voice bot technology adoption table stakes
Top Enterprise Use Cases for Voice Bot Technology:
While far from fully mature, voice bot and conversational AI technologies have come a long way in recent years, thanks partly to the ability to crunch vast amounts of data, both structured and unstructured.
Launching voice bots is not a vanity project anymore, and AI-powered voice technology is enabling several enterprise use cases, including:
Enterprise Use Cases for Voice Bots and Conversational AI:
- Voice bots and conversational AI can power 24/7 self-serve customer service across channels.
- Voice technology can facilitate setting up appointments and gathering intake information.
- Voice bots can engage with prospects by providing primary and personalized information and capturing lead information.
- Annoying as it may be for recipients, voice technology is helpful in outbound top-of-the-funnel sales calls.
- Voice bots can place reminder calls about appointments, meetings, and demos, thus increasing the show rate.
- Voice bots can interact with other voice bots (or humans) and place orders, smoothing supply chain interactions.
- Voice technology can help customers pay a bill.
- Soon, one voice assistant will call another voice assistant to schedule meetings between participants.
- Voice technology can interact with humans or another conversational AI and make reservations.
- Voice technology can also inform guests about specials and take orders in restaurants.
- Voice bots can help customers navigate the vast landscape of big box stores.
- Well-crafted voice bots can help customers place an order, reschedule a delivery, and return merchandise.
- Upselling and Cross-Selling
- Form filling assistance
- Intelligent Assistant to Contact Center Agents to supplement and complement their efficacy
How do Voice Bots Technologies work?
The following are the building blocks of a voice bot technology platform.
Top Features and Capabilities of Voice Bots Technology:
Voice technology is an emerging standard, and each day, voice bot platform vendors keep adding new features. However, for an enterprise interested in leveraging voice bots for various business situations, here are the top features, functionalities, and capabilities to consider:
Voice Bot Technology Feature List:
- Customizable flows
- A hybrid platform that works across channels – voice, chat, text, web, social media, phone, and video
- Seamless hand-off from one channel to another and from one medium to another, with history intact
- Context Awareness to manage complex conversations
- Multi-language support with the ability to switch languages and maintain a conversational flow
- Accent Processing and Understanding
- Intent Analysis
- Smart routing
- Cutover to a Human Agent when necessary
- NLP (Natural Language Processing), NLU (Natural Language Understanding), and NLG (Natural Language Generation) Capabilities with configurable models.
- Supervised, unsupervised, and deep learning methods that work with structured, semi-structured, and unstructured data
- Self-Learning – Ability to learn from previous interactions and add to the knowledge base
- Voice biometrics – ability to identify a voice and link it to a known profile.
- Speech to Text
- Text to Speech
- Localization that covers not just language but also pricing, product variations, and other aspects
- Ability to include and learn from humans in the loop
- A set of core integrations
- Integration APIs and Webhooks
- Configurability
- Scalability
- Extensible Architecture
- Privacy Features
- Security Features
- Testing and Tuning
- Analytics and Reporting
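Two items in the list above, the hand-off between channels with history intact and context awareness, can be sketched in a few lines of Python. The `ConversationSession` and `Turn` classes, their fields, and the channel names are hypothetical and do not reflect any vendor's API. The only point is that the history lives with the session rather than with the channel, so a switch from voice to chat does not lose it.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    channel: str  # e.g. "voice" or "chat" (hypothetical channel names)
    speaker: str  # "user" or "bot"
    text: str


@dataclass
class ConversationSession:
    """Hypothetical session: history belongs to the session, not the channel."""
    user_id: str
    channel: str = "voice"
    history: list = field(default_factory=list)

    def add_turn(self, speaker: str, text: str) -> None:
        # Record the turn together with the channel it happened on.
        self.history.append(Turn(self.channel, speaker, text))

    def hand_off(self, new_channel: str) -> None:
        # Only the active channel changes; earlier turns are kept.
        self.channel = new_channel

    def last_user_text(self):
        # Walk backwards to find what the user most recently asked for.
        for turn in reversed(self.history):
            if turn.speaker == "user":
                return turn.text
        return None


if __name__ == "__main__":
    session = ConversationSession(user_id="user-123")
    session.add_turn("user", "I want to return a pair of shoes")
    session.hand_off("chat")
    session.add_turn("bot", "Here is the return form for your shoes.")
    for turn in session.history:
        print(turn.channel, turn.speaker, turn.text)
```

After the hand-off, the bot on the chat channel can still call `last_user_text()` to see what the user asked for by voice, instead of making the user start over.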
Adopting and Expanding Voice Technology in the Enterprise: Best Practices
Business technology leaders of enterprises should consider various aspects, including the risks and potential shortcomings, and adopt best practices while implementing voice bot technologies.
- Choose the First Use Case Carefully: While voice bot technology is exciting and the possibilities are endless, choose a use case that is simple enough and provides a visible win for the team. Overly complex use cases that harness every bell and whistle may not be a good idea for your first use case.
- Build with the User in Mind: What does the user want? What is the optimal outcome they are striving for, and how can they get there with the minimum set of steps and interruptions? What are their personalities, preferences, and oddities? What types of languages, accents, and tones of voice come into play (say, an angry customer filing a complaint versus a happy one ordering an expanded configuration)?
- Configure to Facilitate Conversation: The dialog between two humans flows naturally and not according to a rigid set of “If” “Then” routes. That is where AI-based voice bots are a quantum leap. Leverage AI’s ability to converse with context to facilitate conversations.
- Design with Experience in Mind: Put yourself in the shoes of your stakeholders and consider the frustration that a rigid, artificial experience causes when dealing with voice bots. Try to understand their journey and design to smooth the experience. Your sole goal cannot be cost minimization by avoiding human contact.
- Think Multi-Modal, Multi-Channel, and Multi-Media: Supplement voice with visuals or supporting text to guide users to a form, or switch from voice to chat or vice versa.
- Maintain Consistency and Coherence, not Rigidity: Allow conversations to be intelligible and easy flowing, and do not constrict the interaction to a rigid step-by-step process flow.
Pitfalls and Shortcomings of Voice Technology:
- Missing Situational Context: If the voice bot cannot truly fathom the implicit context and display situational awareness, the customer will be frustrated.
- Accents, Idioms, and Cultural Expressions: Language is a complex art, and communication is challenging across different languages, accents, dialects, regional idioms and expressions, and culturally specific meanings of phrases and words. Managing all this complexity is an arduous task.
- Privacy Concerns: Customers fear that voice bots record all interactions and are concerned about cybersecurity and violations of personal privacy.
- Disconnected Interactions: Voice alone may sometimes not be sufficient. And if the voice bot treats each conversation as a distinct, self-contained interaction rather than a follow-through from the previous set of interactions, it may lead to a disjointed experience.
- Reliance on Devices and Device Makers: Whether you are using smartphones, vehicles, or smart speakers, there is a reliance on an external third party, which may lead to limitations, complications, and device/OS diversity that make it hard to succeed with voice technology.
Evaluating Potential Voice Bot Platform Vendors:
Of course, in addition to the features and functions outlined before in the article, enterprise buyers should consider the following:
- Robustness of the NLP and NLU models
- What use cases does the voice bot technology platform support?
- What is the vendor's experience within the sector or the industry?
- What is the voice bot technology vendor's product roadmap?
- How open is the platform, and is there proprietary vendor lock-in?
- Are the voice technology components standards-based?
- Cost and total cost of ownership
- Deployment models
- Training and Transition Burden