Why skimping on speech AI technology could cost banks billions
For years, billions in venture capital has poured into fintech banks like Chime and N26 on the bet that such upstarts can wrest away the lion’s share of an estimated $469 trillion in assets held globally by other financial institutions and retail banks.
Banks have held their own through the pandemic, reporting record 2021 profits on low charge-off rates, rising customer deposits and thriving investment opportunities. Yet a new survey of 142 banking executives around the world, conducted by Capgemini and Qorus for the World Retail Banking Report 2022, found that 70% of them believe they lack the foundational data analysis and AI capabilities to compete over the long term.
What’s the biggest concern? Customer experience. The technology empowering decentralised finance – where consumers bank when and where they want – is now augmented with a more sophisticated, AI-driven banking experience. Mobile apps enable more than just bill pay, as AI-infused virtual assistants alert customers to potentially fraudulent activity or transfer money via voice commands.
While fintechs and technology players like Apple and Google are creating fast, easy-to-use systems for customer interactions, incumbent banks have outdated legacy systems that make it more difficult to leverage the mountains of personal, financial and even social data they’ve amassed for each customer.
What’s more, many are missing the foundational voice assistant technology consumers are embracing in droves. Some 50% of the 8,000 banking customers surveyed for the aforementioned Capgemini report cited voice assistants as a feature they most want to see, yet only 35% of bank executives saw it as a priority.
Context-aware speech AI
Even for banks that are adopting automatic speech recognition, text-to-speech and natural language processing, choosing the right technology is key to everything that follows on the road to sustained and growing customer loyalty.
AI helps call centre representatives provide better answers and solutions by using virtual assistants and chatbots in the initial phases of a call to understand the issue and even resolve it entirely. UK-based NatWest recently reported that Cora – the bank’s conversational AI-based virtual assistant – is handling 58% more inquiries year on year and is completing 40% of those interactions without human intervention.
Following the money
Digital resolution of customers’ inquiries drives significant cost savings for banks, which are expected to save $7.3 billion by 2023 through the use of virtual assistants, according to a recent Juniper Research study.
Banks focused solely on those cost savings typically try to make do with speech AI software that recognises about 80% of the words spoken by a customer. The reason: they don’t have the developer resources to customise chatbot software to understand words or phrases unique to the industry.
That tactic, however, cuts to the core of whether a customer considers each interaction helpful or unhelpful. To compete with fintechs, automatic speech recognition and text-to-speech technology must be industry-specific and even company-specific.
The innovation game
Doing speech right starts with automatic speech recognition. Unless accuracy is above 85%, the downstream services that use speech AI as a foundation won’t deliver the expected business outcomes.
These downstream services include sentiment analysis, hyper-personalisation and even regulatory record-keeping. By working with speech recognition software that already has thousands of pre-trained models, banks can scale quickly simply by tailoring further training to their specific needs. Then, they can deliver the same experience anywhere – on-premises, in the cloud or in a hybrid environment.
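The accuracy figures cited above (about 80% of words recognised versus a target above 85%) are not defined in the surveys quoted here. One common convention, assumed in this sketch, is word accuracy: one minus the word error rate, where the word error rate is the number of substituted, deleted and inserted words divided by the number of words actually spoken. The function names and the sample banking phrase below are illustrative and do not come from any vendor’s product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a spoken word is missing
                curr[j - 1] + 1,     # insertion: an extra word was transcribed
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed convention: accuracy = 1 - WER, floored at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


# Hypothetical example: a generic model mishears the UK banking term "ISA".
reference = "please transfer fifty pounds to my isa account today"
hypothesis = "please transfer fifty pounds to my eyes are account today"
print(f"word accuracy: {word_accuracy(reference, hypothesis):.0%}")
```

In this made-up sentence a single misheard industry term costs two word errors out of nine spoken words, which is the kind of gap that tailoring a model to banking vocabulary is meant to close.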
Banks are still learning the ins and outs of platform innovation. Without a strong foundation in automatic speech recognition and text-to-speech technology, creating and promoting new financial products, maintaining customer relationships and innovating through partnerships are shaky propositions at best.