Last updated September 9, 2020

5 Research Papers on Computational Linguistics For Your Reading List

Published on May 20, 2019

by Akshaya Asokan

As one of the premier institutes for technology, the Massachusetts Institute of Technology (MIT) has produced prominent research that has resulted in many ground-breaking technological advancements.

In this article, we take a look at the top five recent research papers on Computational Linguistics from the institute.

1) Learning an Executable Neural Semantic Parser

Authors: Jianpeng Cheng, Siva Reddy, Vijay Saraswat, and Mirella Lapata

Abstract: This article describes a neural semantic parser that maps natural language utterances onto logical forms that can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. The parser generates tree-structured logical forms with a transition-based approach, combining a generic tree-generation algorithm with a domain-general grammar defined by the logical language.

Research methodology: To tackle mismatches between natural language and logical form tokens, the researchers explored various attention mechanisms. They also considered different training settings for the neural semantic parser: fully supervised training, where annotated logical forms were given; weakly supervised training, where only denotations were provided; and distant supervision, where only unlabeled sentences and a knowledge base were available.
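The transition-based generation of tree-structured logical forms can be pictured with a small sketch. The action names NT, TER and RED (open a non-terminal, emit a terminal, close the current subtree), the example logical form, and the helper names are illustrative assumptions, not details taken from the article. A real parser would predict each action with a neural network rather than read it from a fixed list.

```python
class Open:
    """Marker for a non-terminal whose children are still being generated."""
    def __init__(self, label):
        self.label = label


def build_tree(actions):
    """Replay (operation, argument) pairs on a stack and return one tree.

    Trees are nested lists of the form [label, child, child, ...];
    terminals are plain strings. Action names are hypothetical.
    """
    stack = []
    for op, arg in actions:
        if op == "NT":        # open a new subtree rooted at `arg`
            stack.append(Open(arg))
        elif op == "TER":     # emit a terminal token
            stack.append(arg)
        elif op == "RED":     # close the most recently opened subtree
            children = []
            while not isinstance(stack[-1], Open):
                children.append(stack.pop())
            label = stack.pop().label
            stack.append([label] + children[::-1])
        else:
            raise ValueError(f"unknown operation: {op}")
    if len(stack) != 1 or isinstance(stack[0], Open):
        raise ValueError("action sequence does not form a single tree")
    return stack[0]


def render(tree):
    """Print a nested-list tree as a functional logical form."""
    if isinstance(tree, str):
        return tree
    return f"{tree[0]}({', '.join(render(child) for child in tree[1:])})"


# Hypothetical action sequence for a made-up query about rivers in Texas.
actions = [
    ("NT", "count"), ("NT", "and"), ("TER", "river"),
    ("NT", "traverse"), ("TER", "texas"),
    ("RED", None), ("RED", None), ("RED", None),
]
tree = build_tree(actions)
logical_form = render(tree)
print(logical_form)
```

Because every action either opens, extends or closes a subtree, any complete action sequence yields a well-formed tree, which is what makes the output executable against a knowledge base or database.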

2) Unsupervised Compositionality Prediction of Nominal Compounds

Authors: Silvio Cordeiro, Aline Villavicencio, Marco Idiart and Carlos Ramisch

Abstract: Nominal compounds such as red wine and nut case display a continuum of compositionality, with varying contributions from the components of the compound to its semantics. This article proposes a framework for compound compositionality prediction using distributional semantic models, evaluating to what extent they capture idiomaticity compared to human judgments.

Research methodology: For evaluation, the researchers introduced data sets containing human judgments in three languages: English, French, and Portuguese. The results obtained reveal a high agreement between the models and human predictions, suggesting that they were able to incorporate information about idiomaticity.
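One common way to turn a distributional semantic model into a compositionality score is to compare the vector of the whole compound with a vector composed from its parts. The sketch below assumes additive composition and cosine similarity, and the three-dimensional vectors are made-up toy values, not output from any real model or from the article's data sets.

```python
import math


def norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))


def compositionality_score(compound_vec, first_vec, second_vec):
    """Similarity between the compound's own vector and the sum of its
    length-normalised component vectors (an assumed composition function)."""
    n1, n2 = norm(first_vec), norm(second_vec)
    composed = [a / n1 + b / n2 for a, b in zip(first_vec, second_vec)]
    return cosine(compound_vec, composed)


# Toy vectors, invented for illustration only.
vectors = {
    "red": [1.0, 0.0, 0.0],
    "wine": [0.0, 1.0, 0.0],
    "red_wine": [0.6, 0.8, 0.1],   # close to its parts: compositional
    "nut": [1.0, 0.0, 0.0],
    "case": [0.0, 1.0, 0.0],
    "nut_case": [0.1, 0.1, 1.0],   # far from its parts: idiomatic
}

red_wine_score = compositionality_score(
    vectors["red_wine"], vectors["red"], vectors["wine"])
nut_case_score = compositionality_score(
    vectors["nut_case"], vectors["nut"], vectors["case"])
print(round(red_wine_score, 3), round(nut_case_score, 3))
```

A higher score means the compound behaves more like the combination of its parts. Scores of this kind can then be compared against human compositionality judgments, for example with a rank correlation.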

3) Automatic Inference of Sound Correspondence Patterns across Multiple Languages

Authors: Johann-Mattis List

Abstract: The researcher presented an automatic method for the inference of sound correspondence patterns across multiple languages based on a network approach. The core idea was to represent all columns in aligned cognate sets as nodes in a network with edges representing the degree of compatibility between the nodes.

Research methodology: The task of inferring all compatible correspondence sets can then be handled as the well-known minimum clique cover problem in graph theory, which essentially seeks to split the graph into the smallest number of cliques such that each node belongs to exactly one clique. The resulting partitions represent all correspondence patterns that can be inferred for a given data set. By excluding those patterns that occur in only a few cognate sets, the core of regularly recurring sound correspondences can be inferred. Based on this idea, the article presents a method for automatic correspondence pattern recognition, implemented as part of a Python library that supplements the article.
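The clique cover idea can be sketched with a simple greedy heuristic. This is not the article's library or its exact algorithm: minimum clique cover is NP-hard, so the version below only approximates a small cover. The compatibility rule (two columns are compatible if they never disagree in a language where both have a sound), the function names, and the toy alignment columns are all assumptions made for illustration.

```python
def compatible(col_a, col_b):
    """Two columns are compatible if they agree in every language they share.
    Columns with no shared language count as compatible here (an assumption)."""
    shared = set(col_a) & set(col_b)
    return all(col_a[lang] == col_b[lang] for lang in shared)


def greedy_clique_cover(columns):
    """Assign each column to the first pattern whose merged consensus it fits.

    Checking against the consensus guarantees the new column is compatible
    with every member, so each pattern is a clique in the compatibility graph.
    """
    patterns = []
    # Visit columns with the most attested languages first.
    order = sorted(range(len(columns)), key=lambda i: -len(columns[i]))
    for i in order:
        col = columns[i]
        for pattern in patterns:
            if compatible(pattern["consensus"], col):
                pattern["consensus"].update(col)
                pattern["members"].append(i)
                break
        else:
            patterns.append({"consensus": dict(col), "members": [i]})
    return patterns


# Toy alignment columns (language -> sound), invented for illustration.
columns = [
    {"German": "d", "English": "th", "Dutch": "d"},
    {"German": "d", "English": "th"},
    {"German": "t", "English": "d", "Dutch": "d"},
    {"English": "d", "Dutch": "d"},
]

patterns = greedy_clique_cover(columns)
# Keep only patterns supported by at least two columns.
regular = [p for p in patterns if len(p["members"]) >= 2]
for p in regular:
    print(p["consensus"], p["members"])
```

The final filter mirrors the step of discarding patterns that occur in only a few cognate sets; the threshold of two is arbitrary.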

4) A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots

Authors: Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, and Ming Zhou

Abstract: The researchers studied the problem of response selection for multi-turn conversation in retrieval-based chatbots. The task involved matching a response candidate with a conversation context, the challenges for which include how to recognize important parts of the context, and how to model the relationships among utterances in the context.

Research methodology: Within a new matching framework called the sequential matching framework (SMF), the researchers proposed a sequential convolutional network and a sequential attention network, and conducted experiments on two public data sets to test their performance. The experimental results showed that both models can significantly outperform state-of-the-art matching methods. The researchers also showed that the models are interpretable, with visualisations that provide insights into how they capture and leverage important information in contexts for matching.

5) Parsing Chinese Sentences with Grammatical Relations

Authors: Weiwei Sun, Yufei Chen, Xiaojun Wan and Meichun Liu

Abstract: The research represents grammatical information using general directed dependency graphs. Both only-local and rich long-distance dependencies are explicitly represented.

Research methodology: To create high-quality annotations, the researchers took advantage of an existing TreeBank, namely the Chinese TreeBank (CTB), which is grounded in Government and Binding theory. Two key problems addressed by the researchers were (a) how to decompose a complex graph into simple subgraphs, and (b) how to combine subgraphs into a coherent complex graph. For transition-based parsing, the researchers introduced a neural parser based on a list-based transition system. They also discussed several other key problems, including dynamic oracles and beam search for neural transition-based parsing. The evaluation gauged how successful grammatical relation (GR) parsing for Chinese can be when data-driven models are applied. The empirical analysis suggests several directions for future study.

PS: The story was written using a keyboard.
