Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Reinig, Ines; Markert, Katja

Computer Science > Computation and Language

arXiv:2306.04523 (cs)

[Submitted on 7 Jun 2023]

Title: Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs

Authors: Ines Reinig, Katja Markert

Abstract: Compared to English, German word order is freer and therefore poses additional challenges for natural language inference (NLI). We create WOGLI (Word Order in German Language Inference), the first adversarial NLI dataset for German word order that has the following properties: (i) each premise has an entailed and a non-entailed hypothesis; (ii) premise and hypotheses differ only in word order and necessary morphological changes to mark case and number. In particular, each premise and its two hypotheses contain exactly the same lemmata. Our adversarial examples require the model to use morphological markers in order to recognise or reject entailment. We show that current German autoencoding models fine-tuned on translated NLI data can struggle on this challenge set, reflecting the fact that translated NLI datasets will not mirror all necessary language phenomena in the target language. We also examine performance after data augmentation as well as on related word order phenomena derived from WOGLI. Our datasets are publicly available at this https URL.
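
To make properties (i) and (ii) concrete, here is a minimal sketch in Python. It is an illustration only: the German sentences are invented for this example and are not taken from WOGLI, and `MinimalPair`, `TOY_LEMMAS`, `lemma_bag` and `shares_lemmata` are hypothetical names, not part of the authors' code or data format. The sketch checks only the shared-lemmata property, using a hand-written lemma table in place of a real German lemmatizer.

```python
from collections import Counter
from dataclasses import dataclass

# Hand-written toy lemma table (an assumption for this sketch; a real
# pipeline would use a German lemmatizer instead).
TOY_LEMMAS = {
    "der": "der", "den": "der",
    "arzt": "Arzt", "kunde": "Kunde", "kunden": "Kunde",
    "warnt": "warnen",
}


@dataclass(frozen=True)
class MinimalPair:
    premise: str
    entailed: str       # same meaning as the premise, different word order
    non_entailed: str   # subject and object roles swapped via case marking


def lemma_bag(sentence: str) -> Counter:
    """Return the multiset of lemmas in a sentence, using the toy table."""
    tokens = sentence.rstrip(".").lower().split()
    return Counter(TOY_LEMMAS[token] for token in tokens)


def shares_lemmata(pair: MinimalPair) -> bool:
    """Check that the premise and both hypotheses use the same lemmas."""
    reference = lemma_bag(pair.premise)
    return (lemma_bag(pair.entailed) == reference
            and lemma_bag(pair.non_entailed) == reference)


# Invented example, not taken from WOGLI.
pair = MinimalPair(
    premise="Der Arzt warnt den Kunden.",       # The doctor warns the customer.
    entailed="Den Kunden warnt der Arzt.",      # Object-first order, same roles.
    non_entailed="Der Kunde warnt den Arzt.",   # The customer warns the doctor.
)

print(shares_lemmata(pair))
```

In this invented pair, only the case marking on the article and noun ("den Kunden" versus "der Kunde") distinguishes the entailed hypothesis from the non-entailed one, which is the kind of morphological cue the abstract says a model must use.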

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2306.04523 [cs.CL]
	(or arXiv:2306.04523v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.04523

Submission history

From: Ines Reinig [view email]
[v1] Wed, 7 Jun 2023 15:33:07 UTC (60 KB)
