Voice-based intelligent platforms have become quite popular over the years, and tech giants are adopting them as a way to offer consumer-centric services. Whether it is Apple’s Siri, Google’s Assistant or Microsoft’s Cortana, the list is growing at a significant pace. While these platforms rely heavily on artificial intelligence, a perfect conversational AI is yet to be achieved.
These trending bots and intelligent assistants only follow simple commands and queries based on their training. As Microsoft’s blog notes, rich and effective communication requires intelligent assistants to hold a natural dialogue, an ability current systems largely lack. To build a truly intelligent bot, Microsoft recently acquired conversational AI startup Semantic Machines.
Semantic Machines: Background
This California-based company has aggressively developed fundamental technology that allows humans to interact naturally with computers. It is led by tech entrepreneur Dan Roth, UC Berkeley professor Dan Klein and Stanford University professor Percy Liang, and the acquisition gives Microsoft access to that formidable talent. Many of Semantic Machines’ current staffers previously worked at Nuance, the voice recognition company that once powered Siri, which says a lot about their experience in the field.
Semantic Machines was founded in 2014 by Roth, whose earlier venture, Voice Signal Technologies, was acquired by Nuance in 2007. This strong background explains why Microsoft chose Semantic Machines to strengthen its voice-based AI portfolio.
What Does Microsoft Want?
With a strong background in conversational AI, Microsoft plans to tap into Semantic Machines’ Conversation Engine, which extracts semantic intent from natural input such as voice or text. It then generates a self-updating learning framework that manages dialogue context, state, salience and the goals of the end users.
Also, the startup’s Natural Language Generation technology formulates communication with the user based on dialogue context. It relies on machine learning to enable users to discover, access and interact with information and services in a much more natural way. The idea is to create a conversational system that is contextual, unlike most dialogue technologies today, which treat each request in isolation.
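To make the idea of a contextual dialogue system concrete, here is a minimal sketch in Python. All names here (DialogueState, parse, respond) are hypothetical illustrations, not Semantic Machines’ actual engine or API; the sketch only shows how carrying state across turns lets an elliptical follow-up like “And tomorrow?” inherit the intent of the previous query, which a one-shot system would fail to interpret.

```python
# Hypothetical sketch of a contextual dialogue manager (illustrative only,
# not Semantic Machines' engine). The point: state carried across turns
# lets an elliptical follow-up reuse the previous intent.

class DialogueState:
    def __init__(self):
        self.intent = None   # e.g. "weather"
        self.slots = {}      # e.g. {"day": "today"}

    def update(self, intent, slots):
        if intent is not None:       # a new intent starts a fresh frame
            self.intent = intent
        self.slots.update(slots)     # follow-ups only patch slot values

def parse(utterance):
    """Toy semantic parser: returns (intent or None, slot updates)."""
    text = utterance.lower()
    if "weather" in text:
        return "weather", {"day": "today"}
    if "tomorrow" in text:
        return None, {"day": "tomorrow"}   # elliptical follow-up, no intent
    return None, {}

def respond(state):
    if state.intent == "weather":
        return f"Fetching the weather for {state.slots.get('day', 'today')}."
    return "Sorry, I didn't catch that."

state = DialogueState()
state.update(*parse("What's the weather like?"))
print(respond(state))                  # weather intent, day=today
state.update(*parse("And tomorrow?"))  # no new intent: context carries over
print(respond(state))                  # still weather, day=tomorrow
```

A context-free system would have no answer for “And tomorrow?” on its own; the carried-over intent is what makes the second turn resolvable.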
Semantic Machines claims that its AI can produce conversations that not only answer queries but also anticipate follow-up questions more effectively. It also claims that its AI sustains a natural conversational flow that the likes of Siri, Google Assistant, Alexa and Microsoft’s own Cortana have not yet fully achieved.
Natural Dialogue In Conversational AI
Semantic Machines’ solution, which is mostly focused on enterprise customers, would bolster not only Cortana but also social chatbots like XiaoIce, which has recorded more than 30 billion conversations across Asian markets to date. It will also improve capabilities in Microsoft Cognitive Services and Azure Bot Service.
“XiaoIce has had more than 30 billion conversations, averaging up to 30 minutes each, with 200 million users. With XiaoIce and Cortana, we’ve made breakthroughs in speech recognition and more recently become the first to add full-duplex voice sense to a conversational AI system, allowing people to carry on a conversation naturally,” said David Ku, CVP and chief technology officer of Microsoft AI and Research.
The company also plans to establish a Conversational AI Center of Excellence in Berkeley to further boost the developments in this area. “By combining Semantic Machines’ technology with Microsoft’s own AI advances, we aim to deliver powerful, natural and more productive user experiences that will take conversational computing to a new level. We’re excited to bring the Semantic Machines team and their technology to Microsoft,” he added.
Other Initiatives By Microsoft In AI
Microsoft has been driving research and breakthroughs in the building blocks of conversational AI for years. With continued efforts in speech recognition and natural language understanding, it is inching towards a phase where chatbots can talk and understand like humans. In 2016, the company took a step in conversational computing by introducing a framework for developing bots, along with pre-built Cognitive Services for infusing speech recognition and natural language understanding into intelligent assistants.
Earlier this year, in April, Microsoft announced a technological breakthrough that allows people to converse with an AI-powered chatbot in a more natural way, similar to talking on the phone with a friend. The technology is currently integrated into XiaoIce, and the company aims to apply it to its other chatbots, such as Zo.
Microsoft refers to this as full duplex, meaning the ability to communicate in both directions simultaneously, as in a telephone call. With it, users would not have to repeat a “wake word”, as is the case with most voice assistants currently on the market. Di Li, Microsoft’s general manager for XiaoIce, noted in a company blog post that these improvements are part of Microsoft’s effort to build AI-powered social chatbots that understand people’s emotional as well as intellectual needs. The acquisition of Semantic Machines is clearly aimed at strengthening these efforts.
With this acquisition, Microsoft is also aiming to integrate a sense of humour and the ability to chit-chat, play games and remember personal details. Full duplex already allows for these relaxed, natural exchanges.
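The full-duplex contrast can be sketched in a few lines of Python. This is an illustration of the communication pattern only, not Microsoft’s implementation: a half-duplex assistant blocks on a wake word and alternates turns, whereas a full-duplex one listens and speaks concurrently, here modelled with two threads and two queues.

```python
# Illustrative sketch of full-duplex interaction (not Microsoft's
# implementation): the listener keeps consuming user speech even while
# the speaker is producing replies, so no wake word gates each turn.

import queue
import threading

def full_duplex_session(incoming, outgoing, turns):
    """Listen and speak concurrently for a fixed number of turns."""
    heard = []

    def listener():
        for _ in range(turns):
            heard.append(incoming.get())   # never blocked by speaking

    def speaker():
        for i in range(turns):
            outgoing.put(f"reply {i}")     # talks without waiting for silence

    t_listen = threading.Thread(target=listener)
    t_speak = threading.Thread(target=speaker)
    t_listen.start()
    t_speak.start()
    # The user keeps talking; no wake word is needed between utterances.
    for i in range(turns):
        incoming.put(f"utterance {i}")
    t_listen.join()
    t_speak.join()
    return heard

incoming, outgoing = queue.Queue(), queue.Queue()
heard = full_duplex_session(incoming, outgoing, 3)
print(heard)  # every utterance is heard, even while replies were being sent
```

In a half-duplex design the listener loop would pause whenever the speaker was active; running both concurrently is what removes the turn-taking gate the article describes.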
On A Concluding Note
Apart from Microsoft, Google too has been working extensively on life-like versions of its spoken AI. In May, Sundar Pichai introduced “Duplex” at Google I/O, where he demonstrated how the AI system could perform tasks such as booking an appointment, sounding every bit like a human. While a naturally conversant AI is an interesting idea, we wait to see when it hits the market.