
5 Research Papers on Computational Linguistics For Your Reading List


As one of the premier institutes for technology, the Massachusetts Institute of Technology (MIT) has produced prominent research that has resulted in many ground-breaking technological advancements.

In this article, we take a look at the top five recent research papers on Computational Linguistics from the institute.

1) Learning an Executable Neural Semantic Parser

Authors: Jianpeng Cheng, Siva Reddy, Vijay Saraswat, and Mirella Lapata

Abstract: This article describes a neural semantic parser that maps natural language utterances onto logical forms that can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. The parser generates tree-structured logical forms with a transition-based approach, combining a generic tree-generation algorithm with a domain-general grammar defined by the logical language.

Research methodology: To tackle mismatches between natural language and logical form tokens, various attention mechanisms were explored. Finally, the researchers considered different training settings for the neural semantic parser, including fully supervised training where annotated logical forms were given, weakly supervised training where denotations were provided, and distant supervision where only unlabeled sentences and a knowledge base were available.
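To make the transition-based tree generation mentioned in the abstract more concrete, here is a minimal sketch of a stack machine that turns a sequence of actions into a tree-structured logical form. The action names (NT to open a nonterminal, TER to add a terminal, RED to close a subtree), the function name build_logical_form and the example logical form are illustrative assumptions, not taken from the paper's code. In the parser described above a neural model would predict the actions; here they are simply supplied by hand.

```python
# Illustrative sketch only: action names and the example query are assumptions.

def build_logical_form(actions):
    """Run a sequence of (operation, argument) actions on a stack.

    NT  pushes an open nonterminal (a predicate waiting for its children).
    TER pushes a terminal token (for example an entity).
    RED pops children back to the nearest open nonterminal and
        replaces them with one completed subtree.
    """
    stack = []
    for op, arg in actions:
        if op == "NT":
            stack.append(("OPEN", arg))
        elif op == "TER":
            stack.append(arg)
        elif op == "RED":
            children = []
            # Collect finished items until the open nonterminal is found.
            while stack and not isinstance(stack[-1], tuple):
                children.append(stack.pop())
            if not stack:
                raise ValueError("RED without a matching NT")
            _, name = stack.pop()
            children.reverse()
            stack.append(f"{name}({', '.join(children)})")
        else:
            raise ValueError(f"unknown operation: {op}")
    if len(stack) != 1 or isinstance(stack[0], tuple):
        raise ValueError("action sequence does not form a single tree")
    return stack[0]


# Hand-written action sequence for a made-up query.
example_actions = [
    ("NT", "count"),
    ("NT", "rivers"),
    ("TER", "texas"),
    ("RED", None),
    ("RED", None),
]
logical_form = build_logical_form(example_actions)
print(logical_form)
```

Because every RED closes exactly one open nonterminal, any action sequence that the machine accepts yields a well-formed tree, which is what makes the resulting logical form executable against a knowledge base or database.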

2) Unsupervised Compositionality Prediction of Nominal Compounds

Authors: Silvio Cordeiro, Aline Villavicencio, Marco Idiart and Carlos Ramisch

Abstract: Nominal compounds such as red wine and nut case display a continuum of compositionality, with varying contributions from the components of the compound to its semantics. This article proposes a framework for compound compositionality prediction using distributional semantic models, evaluating to what extent they capture idiomaticity compared to human judgments.

Research methodology: For evaluation, the researchers introduced data sets containing human judgments in three languages: English, French, and Portuguese. The results obtained reveal a high agreement between the model predictions and human judgments, suggesting that the models were able to incorporate information about idiomaticity.
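One common way to predict compositionality with a distributional semantic model is to compare the vector of the whole compound with a vector composed from its parts. The sketch below assumes that setup: it adds the normalized component vectors and takes the cosine similarity with the compound vector. The three-dimensional vectors are invented toy values and the function names are hypothetical; real models would use vectors learned from a corpus, and the article's exact scoring function may differ.

```python
import math

# Illustrative sketch only: toy vectors and function names are assumptions.

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    n = norm(v)
    return [x / n for x in v]

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def compositionality_score(compound_vec, modifier_vec, head_vec):
    """Cosine between the compound vector and the sum of its normalized parts."""
    composed = [m + h for m, h in zip(normalize(modifier_vec), normalize(head_vec))]
    return cosine(compound_vec, composed)

# Invented toy vectors: "red_wine" lies close to red + wine,
# while "nut_case" points away from nut + case.
vectors = {
    "red": [1.0, 0.0, 0.0],
    "wine": [0.0, 1.0, 0.0],
    "red_wine": [0.9, 1.0, 0.1],
    "nut": [1.0, 0.0, 0.0],
    "case": [0.0, 1.0, 0.0],
    "nut_case": [0.1, 0.1, 1.0],
}

red_wine_score = compositionality_score(
    vectors["red_wine"], vectors["red"], vectors["wine"])
nut_case_score = compositionality_score(
    vectors["nut_case"], vectors["nut"], vectors["case"])
print(red_wine_score > nut_case_score)
```

A higher score means the compound behaves more like the combination of its parts. Scores of this kind can then be correlated with human compositionality ratings for the same compounds.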

3) Automatic Inference of Sound Correspondence Patterns across Multiple Languages

Authors: Johann-Mattis List

Abstract: The researcher presented an automatic method for the inference of sound correspondence patterns across multiple languages based on a network approach. The core idea was to represent all columns in aligned cognate sets as nodes in a network with edges representing the degree of compatibility between the nodes.

Research methodology: The task of inferring all compatible correspondence sets can then be handled as the well-known minimum clique cover problem in graph theory, which essentially seeks to split the graph into the smallest number of cliques such that each node belongs to exactly one clique. The resulting partitions represent all correspondence patterns that can be inferred for a given data set. By excluding those patterns that occur in only a few cognate sets, the core of regularly recurring sound correspondences can be inferred. Based on this idea, the article presents a method for automatic correspondence pattern recognition, implemented as part of a Python library that supplements the article.
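The clique cover idea can be sketched in a few lines of Python. The example below is a simple greedy heuristic, not the algorithm or the library from the article: it assumes each alignment column is a tuple with one sound per language (None where a language has no reflex), treats two columns as compatible when they never disagree and share at least one attested language, and assigns each column to the first clique whose members are all compatible with it. The data, the compatibility rule and the function names are illustrative assumptions.

```python
# Illustrative sketch only: a greedy heuristic on invented data.

def compatible(col_a, col_b):
    """Columns are compatible if they never conflict and overlap at least once."""
    shared = False
    for a, b in zip(col_a, col_b):
        if a is None or b is None:
            continue  # missing data cannot cause a conflict
        if a != b:
            return False
        shared = True
    return shared

def greedy_clique_cover(columns):
    """Partition column indices into cliques of mutually compatible columns."""
    cliques = []
    for idx, col in enumerate(columns):
        for clique in cliques:
            if all(compatible(col, columns[j]) for j in clique):
                clique.append(idx)
                break
        else:
            cliques.append([idx])  # no existing clique fits, start a new one
    return cliques

# One tuple per alignment column, one position per language (toy data).
columns = [
    ("d", "t", None),
    ("d", None, "d"),
    ("t", "t", "t"),
    (None, "t", "d"),
    ("t", None, "t"),
]

cliques = greedy_clique_cover(columns)
# Keep only patterns supported by at least three columns.
frequent = [clique for clique in cliques if len(clique) >= 3]
print(cliques)   # [[0, 1, 3], [2, 4]]
print(frequent)  # [[0, 1, 3]]
```

Minimum clique cover is NP-hard in general, so a greedy pass like this only approximates the smallest partition, and its result can depend on the order in which columns are visited.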

4) A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots

Authors: Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, and Ming Zhou

Abstract: The researchers studied the problem of response selection for multi-turn conversation in retrieval-based chatbots. The task involves matching a response candidate with a conversation context. Its challenges include recognizing the important parts of the context and modeling the relationships among the utterances in the context.

Research methodology: Using a new matching framework called the sequential matching framework (SMF), the researchers proposed a sequential convolutional network and a sequential attention network, and conducted experiments on two public data sets to test their performance. The experimental results show that both models significantly outperform state-of-the-art matching methods. The researchers also show that the models are interpretable, with visualisations that provide insights into how they capture and leverage important information in contexts for matching.

5) Parsing Chinese Sentences with Grammatical Relations

Authors: Weiwei Sun, Yufei Chen, Xiaojun Wan and Meichun Liu

Abstract: The research represents grammatical information using general directed dependency graphs. Both only-local and rich long-distance dependencies are explicitly represented.

Research methodology: To create high-quality annotations, the researchers took advantage of an existing treebank, the Chinese TreeBank (CTB), which is grounded in Government and Binding theory. The two key problems addressed by the researchers are (a) how to decompose a complex graph into simple subgraphs, and (b) how to combine subgraphs into a coherent complex graph. For transition-based parsing, the researchers introduced a neural parser based on a list-based transition system. They also discussed several other key problems, including dynamic oracles and beam search for neural transition-based parsing. The evaluation gauged how successful grammatical relation (GR) parsing for Chinese can be when data-driven models are applied. The empirical analysis suggests several directions for future study.


Akshaya Asokan

Akshaya Asokan works as a Technology Journalist at Analytics India Magazine. She has previously worked with IDG Media and The New Indian Express. When not writing, she can be seen either reading or staring at a flower.