-
Formal Proofs as Structured Explanations: Proposing Several Tasks on Explainable Natural Language Inference
Authors:
Lasha Abzianidze
Abstract:
In this position paper, we propose a reasoning framework that can model the reasoning process underlying natural language inferences. The framework is based on the semantic tableau method, a well-studied proof system in formal logic. Like the semantic tableau, the framework is driven by refutation: something is proved if and only if its counterexample is refuted. Despite being rooted in formal logic, the framework shares similarities with mental models, a theory in the psychology of reasoning. We will show how the reasoning framework can facilitate the collection of comprehensive and structured explanations for existing naturalistic inference problems. To make the suggestion more concrete, we propose a method of semi-automatically obtaining structured explanations from the formal proofs of a reliable and high-performing logic-based inference system. Taking advantage of the in-depth information available in the generated formal proofs, we show how it can be used to define natural language reasoning tasks with structured explanations. The proposed tasks can be ordered according to difficulty defined in terms of the granularity of explanations. We argue that the tasks that contain a natural sketch of the proofs will suffer from substantially fewer shortcomings than the existing explainable reasoning tasks (or datasets).
Submitted 6 February, 2025; v1 submitted 14 November, 2023;
originally announced November 2023.
-
SpaceNLI: Evaluating the Consistency of Predicting Inferences in Space
Authors:
Lasha Abzianidze,
Joost Zwarts,
Yoad Winter
Abstract:
While many natural language inference (NLI) datasets target certain semantic phenomena, e.g., negation, tense & aspect, monotonicity, and presupposition, to the best of our knowledge, there is no NLI dataset that involves diverse types of spatial expressions and reasoning. We fill this gap by semi-automatically creating an NLI dataset for spatial reasoning, called SpaceNLI. The data samples are automatically generated from a curated set of reasoning patterns, where the patterns are annotated with inference labels by experts. We test several SOTA NLI systems on SpaceNLI to gauge the complexity of the dataset and the systems' capacity for spatial reasoning. Moreover, we introduce Pattern Accuracy and argue that it is a more reliable and stricter measure than accuracy for evaluating a system's performance on pattern-based generated data samples. Based on the evaluation results, we find that the systems obtain moderate results on the spatial NLI problems but lack consistency per inference pattern. The results also reveal that non-projective spatial inferences (especially due to the "between" preposition) are the most challenging ones.
Submitted 5 July, 2023;
originally announced July 2023.
-
A Logic-Based Framework for Natural Language Inference in Dutch
Authors:
Lasha Abzianidze,
Konstantinos Kogkalidis
Abstract:
We present a framework for deriving inference relations between Dutch sentence pairs. The proposed framework relies on logic-based reasoning to produce inspectable proofs leading up to inference labels; its judgements are therefore transparent and formally verifiable. At its core, the system is powered by two $λ$-calculi, used as syntactic and semantic theories, respectively. Sentences are first converted to syntactic proofs and terms of the linear $λ$-calculus using a choice of two parsers: an Alpino-based pipeline, and Neural Proof Nets. The syntactic terms are then converted to semantic terms of the simply typed $λ$-calculus, via a set of hand-designed type- and term-level transformations. Pairs of semantic terms are then fed to an automated theorem prover for natural logic which reasons with them while using the lexical relations found in the Open Dutch WordNet. We evaluate the reasoning pipeline on the recently created Dutch natural language inference dataset, and achieve promising results, remaining within a $1.1$-$3.2\%$ performance margin of strong neural baselines. To the best of our knowledge, the reasoning pipeline is the first logic-based system for Dutch.
Submitted 14 January, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
The Parallel Meaning Bank: A Framework for Semantically Annotating Multiple Languages
Authors:
Lasha Abzianidze,
Rik van Noord,
Chunliu Wang,
Johan Bos
Abstract:
This paper gives a general description of the ideas behind the Parallel Meaning Bank, a framework that aims to provide an easy way to annotate compositional semantics for texts written in languages other than English. The annotation procedure is semi-automatic, and comprises seven layers of linguistic information: segmentation, symbolisation, semantic tagging, word sense disambiguation, syntactic structure, thematic role labelling, and co-reference. New languages can be added to the meaning bank as long as the documents are based on translations from English, but they also introduce interesting new challenges to the linguistic assumptions underlying the Parallel Meaning Bank.
Submitted 29 December, 2020;
originally announced December 2020.
-
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs
Authors:
Lasha Abzianidze,
Johan Bos,
Stephan Oepen
Abstract:
Discourse Representation Theory (DRT) is a formal account for representing the meaning of natural language discourse. Meaning in DRT is modeled via a Discourse Representation Structure (DRS), a meaning representation with a model-theoretic interpretation, which is usually depicted as nested boxes. In contrast, a directed labeled graph is a common data structure used to encode semantics of natural language texts. The paper describes the procedure of dressing up DRSs as directed labeled graphs to include DRT as a new framework in the 2020 shared task on Cross-Framework and Cross-Lingual Meaning Representation Parsing. Since one of the goals of the shared task is to encourage unified models for several semantic graph frameworks, the conversion procedure was biased towards making the DRT graph framework somewhat similar to other graph-based meaning representation frameworks.
Submitted 29 December, 2020;
originally announced December 2020.
-
Learning as Abduction: Trainable Natural Logic Theorem Prover for Natural Language Inference
Authors:
Lasha Abzianidze
Abstract:
Tackling Natural Language Inference with a logic-based method is becoming less and less common. While this might have been counterintuitive several decades ago, nowadays it seems pretty obvious. The main reasons for such a conception are that (a) logic-based methods are usually brittle when it comes to processing wide-coverage texts, and (b) instead of automatically learning from data, they require much manual effort for development. We take a step towards overcoming such shortcomings by modeling learning from data as abduction: reversing a theorem-proving procedure to abduce semantic relations that serve as the best explanation for the gold label of an inference problem. In other words, instead of proving sentence-level inference relations with the help of lexical relations, the lexical relations are proved taking into account the sentence-level inference relations. We implement the learning method in a tableau theorem prover for natural language and show that it improves the performance of the theorem prover on the SICK dataset by 1.4% while still maintaining high precision (>94%). The obtained results are competitive with the state of the art among logic-based systems.
Submitted 1 December, 2020; v1 submitted 29 October, 2020;
originally announced October 2020.
-
Thirty Musts for Meaning Banking
Authors:
Johan Bos,
Lasha Abzianidze
Abstract:
Meaning banking--creating a semantically annotated corpus for the purpose of semantic parsing or generation--is a challenging task. It is quite simple to come up with a complex meaning representation, but it is hard to design a simple meaning representation that captures many nuances of meaning. This paper lists some lessons learned in nearly ten years of meaning annotation during the development of the Groningen Meaning Bank (Bos et al., 2017) and the Parallel Meaning Bank (Abzianidze et al., 2017). The paper's format is rather unconventional: there is no explicit related work, no methodology section, no results, and no discussion (and the current snippet is not an abstract but actually an introductory preface). Instead, its structure is inspired by the work of Traum (2000) and Bender (2013). The list starts with a brief overview of the existing meaning banks (Section 1) and the rest of the items are roughly divided into three groups: corpus collection (Sections 2 and 3), annotation methods (Sections 4-11), and design of meaning representations (Sections 12-30). We hope this overview will give inspiration and guidance in creating improved meaning banks in the future.
Submitted 27 May, 2020;
originally announced May 2020.
-
The First Shared Task on Discourse Representation Structure Parsing
Authors:
Lasha Abzianidze,
Rik van Noord,
Hessel Haagsma,
Johan Bos
Abstract:
The paper presents the IWCS 2019 shared task on semantic parsing where the goal is to produce Discourse Representation Structures (DRSs) for English sentences. DRSs originate from Discourse Representation Theory and represent scoped meaning representations that capture the semantics of negation, modals, quantification, and presupposition triggers. Additionally, concepts and event-participants in DRSs are described with WordNet synsets and the thematic roles from VerbNet. To measure similarity between two DRSs, they are represented in a clausal form, i.e. as a set of tuples. Participant systems were expected to produce DRSs in this clausal form. The rich lexical information, explicit scope marking, high number of shared variables among clauses, and highly constrained format of valid DRSs together make DRS parsing a challenging NLP task. The results of the shared task displayed improvements over the existing state-of-the-art parser.
Submitted 27 May, 2020;
originally announced May 2020.
-
Can neural networks understand monotonicity reasoning?
Authors:
Hitomi Yanaka,
Koji Mineshima,
Daisuke Bekki,
Kentaro Inui,
Satoshi Sekine,
Lasha Abzianidze,
Johan Bos
Abstract:
Monotonicity reasoning is one of the important reasoning skills for any intelligent natural language inference (NLI) model in that it requires the ability to capture the interaction between lexical and syntactic structures. Since no test set has been developed for monotonicity reasoning with wide coverage, it is still unclear whether neural models can perform monotonicity reasoning in a proper way. To investigate this issue, we introduce the Monotonicity Entailment Dataset (MED). Performance by state-of-the-art NLI models on the new test set is substantially worse, under 55%, especially on downward reasoning. In addition, analysis using a monotonicity-driven data augmentation method showed that these models might be limited in their generalization ability in upward and downward reasoning.
Submitted 27 June, 2019; v1 submitted 14 June, 2019;
originally announced June 2019.
-
HELP: A Dataset for Identifying Shortcomings of Neural Models in Monotonicity Reasoning
Authors:
Hitomi Yanaka,
Koji Mineshima,
Daisuke Bekki,
Kentaro Inui,
Satoshi Sekine,
Lasha Abzianidze,
Johan Bos
Abstract:
Large crowdsourced datasets are widely used for training and evaluating neural models on natural language inference (NLI). Despite these efforts, neural models have a hard time capturing logical inferences, including those licensed by phrase replacements, so-called monotonicity reasoning. Since no large dataset has been developed for monotonicity reasoning, it is still unclear whether the main obstacle is the size of datasets or the model architectures themselves. To investigate this issue, we introduce a new dataset, called HELP, for handling entailments with lexical and logical phenomena. We add it to training data for the state-of-the-art neural models and evaluate them on test sets for monotonicity phenomena. The results showed that our data augmentation improved the overall accuracy. We also find that the improvement is better on monotonicity inferences with lexical replacements than on downward inferences with disjunction and modification. This suggests that some types of inferences can be improved by our data augmentation while others are immune to it.
Submitted 27 April, 2019;
originally announced April 2019.
-
Exploring Neural Methods for Parsing Discourse Representation Structures
Authors:
Rik van Noord,
Lasha Abzianidze,
Antonio Toral,
Johan Bos
Abstract:
Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on formal semantics. We present a sequence-to-sequence neural semantic parser that is able to produce Discourse Representation Structures (DRSs) for English sentences with high accuracy, outperforming traditional DRS parsers. To facilitate the learning of the output, we represent DRSs as a sequence of flat clauses and introduce a method to verify that produced DRSs are well-formed and interpretable. We compare models using characters and words as input and see (somewhat surprisingly) that the former performs better than the latter. We show that eliminating variable names from the output using De Bruijn-indices increases parser performance. Adding silver training data boosts performance even further.
Submitted 30 October, 2018;
originally announced October 2018.
-
What can we learn from Semantic Tagging?
Authors:
Mostafa Abdou,
Artur Kulmizev,
Vinit Ravishankar,
Lasha Abzianidze,
Johan Bos
Abstract:
We investigate the effects of multi-task learning using the recently introduced task of semantic tagging. We employ semantic tagging as an auxiliary task for three different NLP tasks: part-of-speech tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the "learning what to share" setting, where negative transfer between tasks is less likely. Our findings show considerable improvements for all tasks, particularly in the "learning what to share" setting, which shows consistent gains across all tasks.
Submitted 29 August, 2018;
originally announced August 2018.
-
Evaluating Scoped Meaning Representations
Authors:
Rik van Noord,
Lasha Abzianidze,
Hessel Haagsma,
Johan Bos
Abstract:
Semantic parsing offers many opportunities to improve natural language understanding. We present a semantically annotated parallel corpus for English, German, Italian, and Dutch where sentences are aligned with scoped meaning representations in order to capture the semantics of negation, modals, quantification, and presupposition triggers. The semantic formalism is based on Discourse Representation Theory, but concepts are represented by WordNet synsets and thematic roles by VerbNet relations. Translating scoped meaning representations to sets of clauses enables us to compare them for the purpose of semantic parser evaluation and checking translations. This is done by computing precision and recall on matching clauses, in a similar way as is done for Abstract Meaning Representations. We show that our matching tool for evaluating scoped meaning representations is both accurate and efficient. Applying this matching tool to three baseline semantic parsers yields F-scores between 43% and 54%. A pilot study is performed to automatically find changes in meaning by comparing meaning representations of translations. This comparison turns out to be an additional way of (i) finding annotation mistakes and (ii) finding instances where our semantic analysis needs to be improved.
Submitted 10 April, 2018; v1 submitted 23 February, 2018;
originally announced February 2018.
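The clause-matching evaluation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' matching tool: the function name `clause_prf` and the example clauses are hypothetical, and the sketch assumes that variable names in the produced and gold meaning representations are already aligned, so clauses can be compared by exact equality. A real matcher, like those used for Abstract Meaning Representations, also has to search for a good variable mapping.

```python
from typing import Iterable, Tuple

Clause = Tuple[str, ...]


def clause_prf(produced: Iterable[Clause], gold: Iterable[Clause]) -> Tuple[float, float, float]:
    """Precision, recall, and F-score over exactly matching clauses.

    Assumption: variables in both clause sets are already aligned,
    so a clause matches only if it is identical to a gold clause.
    """
    produced_set, gold_set = set(produced), set(gold)
    matched = len(produced_set & gold_set)
    precision = matched / len(produced_set) if produced_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical clausal forms for a sentence such as "A person smiles."
gold = [
    ("b1", "REF", "x1"),
    ("b1", "person", "n.01", "x1"),
    ("b1", "REF", "e1"),
    ("b1", "smile", "v.01", "e1"),
]
produced = [
    ("b1", "REF", "x1"),
    ("b1", "person", "n.01", "x1"),
    ("b1", "REF", "e1"),
    ("b1", "laugh", "v.01", "e1"),  # wrong concept, so this clause does not match
]

precision, recall, f_score = clause_prf(produced, gold)
```

Here three of the four produced clauses match the gold clauses, so precision and recall are both 3/4.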
-
Towards Universal Semantic Tagging
Authors:
Lasha Abzianidze,
Johan Bos
Abstract:
The paper proposes the task of universal semantic tagging---tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of the semantic tagset and show that (a) the tags provide semantically fine-grained information, and (b) they are suitable for cross-lingual semantic parsing. An application of the semantic tagging in the Parallel Meaning Bank supports both of these points as the tags contribute to formal lexical semantics and their cross-lingual projection. As a part of the application, we annotate a small corpus with the semantic tags and present a new baseline result for universal semantic tagging.
Submitted 29 September, 2017;
originally announced September 2017.
-
LangPro: Natural Language Theorem Prover
Authors:
Lasha Abzianidze
Abstract:
LangPro is an automated theorem prover for natural language (https://github.com/kovvalsky/LangPro). Given a set of premises and a hypothesis, it is able to prove semantic relations between them. The prover is based on a version of the analytic tableau method specially designed for natural logic. The proof procedure operates on logical forms that preserve linguistic expressions to a large extent. The nature of proofs is deductive and transparent. On the FraCaS and SICK textual entailment datasets, the prover achieves high results comparable to the state of the art.
Submitted 30 August, 2017;
originally announced August 2017.
-
The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
Authors:
Lasha Abzianidze,
Johannes Bjerva,
Kilian Evang,
Hessel Haagsma,
Rik van Noord,
Pierre Ludmann,
Duc-Duy Nguyen,
Johan Bos
Abstract:
The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.
Submitted 13 February, 2017;
originally announced February 2017.