A Survey on Document-level Neural Machine Translation: Methods and Evaluation

Maruf, Sameen; Saleh, Fahimeh; Haffari, Gholamreza

arXiv:1912.08494 (cs)

[Submitted on 18 Dec 2019 (v1), last revised 13 Jan 2021 (this version, v3)]

Abstract: Machine translation (MT) is an important task in natural language processing (NLP) as it automates the translation process and reduces the reliance on human translators. With the resurgence of neural networks, translation quality now surpasses that of translations obtained using statistical techniques for most language pairs. Until a few years ago, almost all neural translation models translated sentences independently, without incorporating the wider document context and the inter-dependencies among sentences. The aim of this survey paper is to highlight the major works that have been undertaken in the space of document-level machine translation after the neural revolution, so that researchers can recognise the current state and future directions of this field. We provide an organisation of the literature based on novelties in modelling and architectures as well as training and decoding strategies. In addition, we cover evaluation strategies that have been introduced to account for the improvements in document MT, including automatic metrics and discourse-targeted test sets. We conclude by presenting possible avenues for future exploration in this research field.

Comments:	Accepted for publication by ACM Computing Surveys
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1912.08494 [cs.CL]
	(or arXiv:1912.08494v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1912.08494

Submission history

From: Sameen Maruf
[v1] Wed, 18 Dec 2019 10:07:20 UTC (73 KB)
[v2] Sun, 11 Oct 2020 23:10:22 UTC (520 KB)
[v3] Wed, 13 Jan 2021 00:31:53 UTC (525 KB)
