In a bid to improve real-time translation in Indian languages, Microsoft has announced that it is integrating artificial intelligence (AI) and Deep Neural Networks into translation for three languages: Tamil, Hindi and Bengali.
Microsoft’s involvement with Indian languages can be traced back to ‘Project Bhasha’, which was launched in 1998 to advance computing in Indian languages.
Sundar Srinivasan, General Manager, AI and Research, Microsoft India, told a news wire, “Microsoft celebrates the diversity of languages in India and wants to make the vast Internet even more accessible. We have supported Indian languages in computing for over two decades, and, more recently, have made significant strides on voice-based access and machine translation across languages.”
Microsoft currently supports the Windows interface in 12 languages, and text input in 22 officially recognised languages across its products.
The AI-enabled translation feature can be used in Microsoft Office 365 products like Excel, PowerPoint, Word, Skype and Outlook. Internet users can also benefit from it on any website reached through Bing search, Bing Translator and the Microsoft Edge browser. It is also available in the Microsoft Translator app for iOS and Android, which can recognise and translate languages from speech, text and photos.
Microsoft has been refining traditional Statistical Machine Translation (SMT) for various global languages since the early 2000s. Using Deep Neural Networks in addition to SMT makes it possible to encode more abstract properties of a word, such as its part of speech (adjective, noun), level of formality (formal, informal, slang) and gender (masculine, feminine).
This is expected to improve translation quality significantly compared with SMT-only translation, because SMT on its own restricts the translation of a word to the local context of a few neighbouring words.
The company has also stated that the translation of Indian languages presents a tough challenge due to the Aryan and Dravidian subdivisions among them. It added that the level of complexity is further compounded as there are 22 official languages across 29 states.
Another major hurdle is the lack of digital material available on the internet to train networks for Indian language translation.
“Six Indian languages are part of top 20 global languages by population. Ironically, these languages are not on top of the digital content list. There’s not enough material on Internet that we could use to train the system,” added Krishna Doss Mohan, Senior Programme Manager, Microsoft India, a member of the team devoted to Indian languages.
In spite of these hurdles, the company claims that translation quality for the Indian languages it supports has improved by nearly 20 percent. This includes improvement in both human and automatic evaluation metrics for the Deep Neural Networks-powered translation systems.