Say Hello To STACL, Baidu’s New Innovation For Improving Simultaneous Translation

Translation is a demanding job that involves listening, speaking and a thorough command of the languages involved. A task that demands a high degree of skill from humans might yet prove a cakewalk for modern computing techniques such as machine learning and artificial intelligence.

Baidu has developed a system that uses ML and AI to translate between languages, with anticipation capabilities and controllable latency. It is an automated system that aims to deliver high-quality translation between two languages. This gives it an advantage over traditional consecutive interpretation, in which the translator waits until the speaker pauses before starting to translate. That method usually doubles the time needed. Simultaneous interpretation is a faster option, where the translator begins translating just a few seconds into the speaker’s speech.

Issues With Current Translation System

The placement of verbs and the use of figures of speech vary significantly across languages. For example, in English the verb comes early in the sentence, whereas in German it often comes at the end. The same problem arises in Chinese-to-English translation. This difference in word order is a major hindrance for real-time human translators, causing undesirable latency and leaving the interpreter out of sync with the speaker. Simultaneous Translation with Anticipation and Controllable Latency (STACL) promises to address the issue of when to translate and which words to use.

How Does It Work?

STACL works on the principle of prediction. As the researchers from Baidu explain, the model doesn’t predict the source-language words in the speaker’s speech but instead directly predicts the target-language words in the translation. The model fuses translation and anticipation in a single “wait-k” model: the translation is always k words behind the speaker’s speech, which gives the model some context for prediction. At each step, the model is trained to use the available prefix of the source sentence to decide the next word in the translation.

The researchers give an example. Given the Chinese prefix Bùshí Zǒngtǒng zài Mòsīkē (“Bush President in Moscow”) and the English translation so far, “President Bush”, which is k=2 words behind the Chinese, their system accurately predicts that the next translated word must be “meet”, because Bush is likely “meeting” someone (e.g., Putin) in Moscow. The prediction is made long before the Chinese verb appears.
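
The wait-k idea can be illustrated with a small Python sketch. It is not Baidu's implementation: the schedule formula min(k + t - 1, source length) is one common way to state "always k words behind" and is an assumption here, and `toy_predict_next` is a hypothetical stand-in for a trained model that simply replays a fixed output. The sketch only shows how much of the source is visible each time a target word is emitted.

```python
from typing import Callable, List, Sequence


def visible_source_words(t: int, k: int, source_len: int) -> int:
    """Source words read before target word t (1-based) is emitted.

    Assumed schedule: start after k source words, then read one more
    source word per emitted target word, capped at the sentence length.
    """
    if t < 1 or k < 1:
        raise ValueError("t and k must be at least 1")
    return min(k + t - 1, source_len)


def wait_k_translate(
    source: Sequence[str],
    k: int,
    predict_next: Callable[[Sequence[str], Sequence[str]], str],
    target_len: int,
) -> List[str]:
    """Emit target_len words, each predicted from a source prefix only."""
    target: List[str] = []
    for t in range(1, target_len + 1):
        prefix = source[: visible_source_words(t, k, len(source))]
        target.append(predict_next(prefix, target))
    return target


# Hypothetical stand-in for a trained model: replays a fixed translation
# and records how many source words it was allowed to see at each step.
TOY_OUTPUT = ["President", "Bush", "meets"]
seen_prefix_lengths: List[int] = []


def toy_predict_next(prefix: Sequence[str], target_so_far: Sequence[str]) -> str:
    seen_prefix_lengths.append(len(prefix))
    return TOY_OUTPUT[len(target_so_far)]


source = ["Bùshí", "Zǒngtǒng", "zài", "Mòsīkē"]
result = wait_k_translate(source, k=2, predict_next=toy_predict_next, target_len=3)
print(result)               # ['President', 'Bush', 'meets']
print(seen_prefix_lengths)  # [2, 3, 4]
```

With k=2, the third target word is predicted when only the four-word prefix from the example is visible, so the verb has to be anticipated rather than read from the source.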

The model, however, needs to be prepared for the speaker’s topic and style beforehand, just as human translators do. This is done by training on large amounts of data with similar sentence structures, which enables the model to anticipate, with reasonable accuracy, the words most likely to be spoken in a sentence. In its current form, STACL focuses on controlling latency.

How STACL Can Match Human Interpretation

Baidu measures STACL’s translation quality with BLEU (Bilingual Evaluation Understudy). BLEU is not a component of the system’s architecture; it is a standard algorithm for estimating the quality of text that has been machine-translated from one natural language to another, and the 3.4 BLEU figure associated with STACL refers to points on this metric. “It [is] a standard evaluation metric for full-sentence translation quality by comparing a machine translation result with a human reference translation”, notes the website.
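
To make the metric concrete, here is a minimal Python sketch of a simplified, single-reference, sentence-level BLEU: clipped n-gram precision combined by a geometric mean and scaled by a brevity penalty. It is an illustration only. Real evaluations are usually computed over a whole corpus with established tools, the function name `simple_bleu` is hypothetical, and the example sentences are made up.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    c, r = len(candidate), len(reference)
    # Brevity penalty: punish candidates shorter than the reference.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = ["President", "Bush", "meets", "Putin", "in", "Moscow"]
print(simple_bleu(reference, reference))                     # 1.0
print(simple_bleu(["President", "Bush"], reference, max_n=2))
print(simple_bleu(["the", "the", "the"], ["the", "cat"], max_n=1))
```

BLEU scores are conventionally reported multiplied by 100, so a difference of a few "BLEU points" corresponds to a few hundredths on the 0 to 1 scale returned here.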

Human interpreters can cover up to 60 percent of the source material with a delay of about three seconds, and the new simultaneous system is claimed to be much more efficient. Earlier Chinese-to-English simultaneous translation lagged the speaker by 3 Chinese words, or about 1.5 to 2 seconds; translation quality with the new ML model is about 5 BLEU points higher.

Outlook

While STACL shows significant potential, the researchers have yet to overcome many limitations of simultaneous machine translation. STACL is not intended to replace human interpreters for now, but its capabilities may help offer improved services in the years to come.

Bharat Adibhatla

Bharat is a voracious reader of biographies and political tomes. He is also an avid astrologer and storyteller who is very active on social media.