Microsoft Previews Advances in Conversational AI Coming to Business Apps
Microsoft readies new tools that advance conversational AI in Office, Teams and more.
May 10, 2019
Microsoft this week previewed a new “conversational engine” that aims to raise the bar in the use of speech recognition by enabling more natural voice interactions with devices, PCs and apps.
The company plans to integrate the new Azure Speech Service, showcased for the first time at its annual Build conference in Seattle, into its Cortana speech interface and Bot Framework, a set of APIs and tools designed to let developers build commercial-grade, voice-enabled AI. Microsoft said the new natural language and intelligent-agent capabilities will also become a core capability in the Microsoft 365 platform, including Office, Teams and other Azure-based services.
Applications and services that support the new conversational AI and virtual agent capabilities will let people speak to their mobile device or PC more naturally and receive more accurate responses or actions. In a video presented at Build, Microsoft showcased an executive organizing her day by having a dynamic conversation with the Cortana virtual agent.
While the Azure Speech Service doesn’t purport to solve all of the problems of bringing conversational computing to parity with human conversation, it appears to address the limitations that now exist. At Build, Microsoft’s flagship technical gathering of software developers, company officials revealed and demonstrated the new conversational AI capabilities.
Microsoft said the new technology introduces the ability to recognize and translate more diverse forms of speech and dialect, adapt to multiple voices in a conversation and calibrate for local and vertical terminology to create more accurate transcripts and responses to spoken commands.
Azure Speech Service moves beyond the common approach of virtual agents limited to restricted command sets — skills or intents programmed manually and mapped to a proper action in a back-end system, according to Microsoft.
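The manually programmed approach the article contrasts against can be sketched roughly as a hand-built intent map. This is a hypothetical illustration of the general pattern, not actual Bot Framework code; the function and intent names are invented for the example:

```python
# A minimal sketch of the traditional virtual-agent approach described
# above: every supported command ("intent") is hand-written by a developer
# and mapped to a back-end action. (Hypothetical illustration only.)

def book_meeting(slots):
    return f"Meeting booked with {slots.get('person', 'someone')}."

def check_calendar(slots):
    return "You have three meetings today."

# The developer must enumerate the restricted command set up front.
INTENT_MAP = {
    "book a meeting": book_meeting,
    "what's on my calendar": check_calendar,
}

def handle_utterance(text):
    handler = INTENT_MAP.get(text.lower().strip())
    if handler is None:
        # Anything outside the programmed command set simply fails --
        # the limitation the learned, data-driven approach aims to remove.
        return "Sorry, I didn't understand that."
    return handler({})
```

The brittleness is visible in the fallback branch: utterances outside the enumerated set go unhandled, whereas the approach Microsoft describes learns the mapping from words to actions from data.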
The resulting service, now in preview, also promises new opportunities for partners by giving ISVs and systems integrators who use Microsoft’s Bot Framework the ability to build speech recognition and virtual agent capabilities into solutions that are more commercially viable, repeatable and customizable.
Since becoming CEO of Microsoft five years ago, Satya Nadella has made commercializing the company’s years of research in AI and natural language understanding a key priority. While Microsoft Research has invested extensively in natural language processing via its Speech and Dialog Research Group, the company’s acquisition of Semantic Machines last year has helped extend that effort with a new approach to conversational AI.
Led by prominent natural language AI researchers Dan Klein, a professor at U.C. Berkeley, Stanford University professor Percy Liang, and former Apple chief speech scientist Larry Gillick, Semantic Machines applies machine learning to the entire development process, moving beyond the traditional approach of manually programming intents.
“Instead of a programmer trying to write a skill that plans for every context, the Semantic Machines system learns the functionality for itself from data,” according to an explanation in a Microsoft blog. “In other words, the Semantic Machines technology learns how to map people’s words to the computational steps needed to carry out requested tasks.”
In his keynote address at Build, Microsoft CEO Nadella said the Semantic Machines acquisition helped accelerate the new capabilities introduced this week.
“The most interesting thing is when you combine speech recognition with language models that are specific to your organizational data,” Nadella told attendees. “Imagine a transcript that gets created that has the ability to understand the local jargon that’s specific to your organization, your industry, that way making the transcript that much more useful.”
Among the new deliverables are Decision, a service that provides specific recommendations, and Personalizer, which lets developers embed personalization capabilities built on a reinforcement-learning model and recommendations engine, letting apps surface optimized business decisions and recommendations for individual users. At this week’s Build conference, Microsoft also introduced its new Conversation Transcription service, which transcribes conversations in real time.
Industry analyst Patrick Moorhead, founder and CEO of Moor Insights & Strategy, said Microsoft introduced some unique advances in conversational computing.
“I see Microsoft as the leader in voice processing and recognition,” Moorhead said, adding that Microsoft has focused its efforts squarely on the commercial and enterprise market, and largely has ceded making a splash with consumers to Amazon and Google.
“Right now, Amazon and Google are primarily focused on consumer applications,” Moorhead said. “And Microsoft is primarily focused on enterprise applications. Microsoft is doing the best at it right now at work-type of conversations, because they have the best data right now. And the consumer market really hasn’t been a focus. Google and Amazon are putting their toe in the water in the business market. While Amazon has Alexa for Business, it’s largely for consumer-facing use, such as in hotels and conference rooms.”