Going beyond simple tasks like playing a song or booking an appointment means generating coherent and engaging responses in conversation, which requires a range of nuanced conversational skills, including language understanding and reasoning.
If AI is to master the art of conversation at a human level, it has an uphill task ahead. Facebook AI lists five areas where chatbots can improve:
- consistency,
- specificity,
- empathy,
- knowledgeability, and
- multimodal understanding.
When Chatbots Empathise
In recent work with researchers from the University of Washington, Facebook AI introduced the first benchmark task of human-written empathetic dialogues, centered on specific emotional labels, to measure a chatbot’s ability to display empathy.
We can’t be too hard on AI in this regard, as we humans are often bad at showing empathy too. But without proper training data, a chatbot can offer congratulations where condolences are called for. Whether AI can be trained for empathy, one of the final frontiers of AGI, is still debated. Nevertheless, as models continue to improve, this aspect of NLP models needs to be addressed as well.
Current state-of-the-art approaches to dialogue modeling involve sequence-to-sequence models, which lack access to information outside of the conversation history.
Newer models are built with the help of Wikipedia data: they first retrieve relevant knowledge and then use it to respond in the dialogue.
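The retrieve-then-respond idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-sentence `KNOWLEDGE` list stands in for Wikipedia passages, and the word-overlap `retrieve` function stands in for a learned retriever. It is not the model described in the research.

```python
import re

# Hypothetical mini knowledge base standing in for Wikipedia passages.
KNOWLEDGE = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Python is a programming language created by Guido van Rossum.",
    "The Great Barrier Reef is a coral reef system off the coast of Australia.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(message, passages):
    """Return the passage sharing the most words with the message, or None."""
    query = tokenize(message)
    best = max(passages, key=lambda p: len(query & tokenize(p)))
    return best if query & tokenize(best) else None

def respond(message, passages=KNOWLEDGE):
    """Ground the reply in a retrieved passage when one is found."""
    fact = retrieve(message, passages)
    if fact is None:
        return "I don't know much about that. Tell me more?"
    return "Here is something I know: " + fact

print(respond("Who created Python?"))
```

Plain word overlap is easily misled by common words such as "the" or "is"; a real system would use a trained retriever and a generative or ranking model for the reply.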
Connecting With Imagery
A picture speaks a thousand words. The cliché is at its truest in the machine world, where every pixel counts. Whether it is an edge, a curve or a color intensity, the information is indeed worth a thousand words. Modern neural networks, especially CNNs, have become very good at object detection, and this ability can be exploited to improve chatbot conversation.
Machine learning approaches that comment on images have typically focused on image captioning, which is factual and neutral in tone, like “fireworks in the sky.” In this research, the authors focused on image captioning that is engaging for humans by incorporating personality. A large dataset of human comments grounded in images was collected and used to train state-of-the-art models capable of discussing images with given personalities, which makes the system much more interesting for humans to talk to.
The Dialogue Natural Language Inference (Dialogue NLI) dataset was created specifically to test techniques for maintaining consistency. In Dialogue NLI, two utterances in a dialogue are treated as the premise and the hypothesis, respectively. Each pair is labeled to indicate whether the premise entails, contradicts, or is neutral with respect to the hypothesis.
Training an NLI model on this dataset and using it to rerank the dialogue model’s candidate responses, so that they entail previous utterances or at least stay consistent with them, improves the overall consistency of the dialogue agent.
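A rough sketch of NLI-based reranking follows. It assumes each candidate response already carries a score from the dialogue model, and it lowers that score for every earlier utterance the candidate contradicts. The `toy_nli` function is a hypothetical keyword-rule stand-in for a trained NLI classifier, and the `penalty` parameter and example sentences are likewise invented for illustration.

```python
from typing import Callable, List, Tuple

# The three labels used for premise/hypothesis pairs.
ENTAIL, NEUTRAL, CONTRADICT = "entailment", "neutral", "contradiction"

def toy_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for a trained NLI model (keyword rules only)."""
    p, h = premise.lower(), hypothesis.lower()
    if "dog" in p and "no pets" in h:
        return CONTRADICT
    if "dog" in p and "dog" in h:
        return ENTAIL
    return NEUTRAL

def rerank(
    candidates: List[Tuple[str, float]],
    history: List[str],
    nli: Callable[[str, str], str],
    penalty: float = 1.0,
) -> List[Tuple[str, float]]:
    """Subtract `penalty` from a candidate's score for each earlier
    utterance it contradicts, then sort by adjusted score (best first)."""
    adjusted = []
    for text, score in candidates:
        contradictions = sum(nli(prev, text) == CONTRADICT for prev in history)
        adjusted.append((text, score - penalty * contradictions))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)

history = ["I have a dog named Rex."]
candidates = [("I have no pets.", 0.9), ("My dog loves the park.", 0.7)]
ranked = rerank(candidates, history, toy_nli)
print(ranked[0][0])
```

Here the response the dialogue model scored highest contradicts the earlier utterance, so the consistent response moves to the top after reranking.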
Consider the question “What do you do for a living?” A typical chatbot responds with a generic statement such as “I’m a construction worker.” With control methods like those developed in collaboration with Stanford AI, chatbots produced more specific and engaging responses, like “I build antique homes and refurbish houses.”
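One simple way to put a number on specificity is word rarity: responses made of rarer words tend to be less generic. The sketch below scores a response by the mean inverse document frequency (IDF) of its words. This is an illustrative proxy only, not necessarily the control method used in the research, and the four-utterance `corpus` is hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())

def idf_table(corpus):
    """Map each word to log(N / number of utterances containing it)."""
    n = len(corpus)
    doc_freq = Counter(word for utt in corpus for word in set(tokenize(utt)))
    return {word: math.log(n / df) for word, df in doc_freq.items()}

def specificity(response, idf, default):
    """Mean IDF of the response's words; unseen words get `default`."""
    words = tokenize(response)
    if not words:
        return 0.0
    return sum(idf.get(w, default) for w in words) / len(words)

# Hypothetical corpus of chatbot utterances used to estimate word rarity.
corpus = [
    "I'm a construction worker.",
    "I'm a teacher.",
    "I'm a nurse and I work nights.",
    "I build antique homes and refurbish houses.",
]
idf = idf_table(corpus)
unseen = math.log(len(corpus))  # treat unseen words as maximally rare

generic = specificity("I'm a construction worker.", idf, unseen)
specific = specificity("I build antique homes and refurbish houses.", idf, unseen)
print(specific > generic)
```

A score like this could be used to rerank candidate responses or as a control signal during generation, although pushing it too high tends to produce odd, overly rare wording.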
Studying multiturn aspects is necessary to improve conversation quality. The better the overall conversation flow, the more engaging and personable the chatbots and dialogue agents of the future will be.
A Future Of Intelligent Interactions
It is possible to train models to improve on some of the most common weaknesses of chatbots today. Future intelligent chatbots will be capable of open-domain dialogue in a way that’s personable, consistent, empathetic, and engaging.
The implications of conversational AI agents can be expected to extend beyond trivial social networking chats. They could be deployed in healthcare emergency scenarios, where a user can type some text or send an image signaling distress. Even setting aside such grave scenarios, enterprises have a lot to gain from intelligent chatbots. They can be deployed for real-time grievance redressal (e.g., Swiggy and Zomato), and banks and insurance agencies can use them for quick loan approvals and answering queries. This eliminates downtime and reaches a wider population.
Access more info on open source datasets and methods here.