Statistical Machine Translation by Parsing

Melamed, I. Dan; Wang, Wei

arXiv:cs/0407005v1 (cs)

[Submitted on 1 Jul 2004 (this version), latest version 24 Nov 2005 (v3)]

Abstract: Designers of statistical machine translation (SMT) systems have begun trying to exploit tree-structured syntactic information. This article offers a coherent algorithmic framework to facilitate such efforts. Our main contribution is a generalization of the common notion of parsing. In an ordinary parser, the input is a single string, and the grammar ranges over strings. In order to use syntactic information, an SMT system requires generalizations of ordinary parsing algorithms that allow the input to consist of string tuples and/or the grammar to range over string tuples. Three particular generalizations, connected by some trivial glue, are all that is necessary for syntax-aware SMT:
A synchronous parser is an algorithm that can infer the syntactic structure of each component text in a multitext and simultaneously infer the correspondence relation between these structures.
When a parser's input can have fewer dimensions than the parser's grammar, it is a translator.
When a parser's grammar can have fewer dimensions than the parser's input, it is a synchronizer.
This article offers a guided tour of these generalized parsing algorithms. It culminates with a recipe for using generalized parsing algorithms to train and apply a syntax-aware SMT system.
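
The three definitions above turn on comparing the dimensionality of a parser's input with that of its grammar. As a minimal sketch of that comparison only (not the authors' algorithms), the following Python function labels a configuration by its two dimension counts. The function name `classify_generalized_parser` and its parameters are hypothetical, and treating the equal-dimension case with more than one component as a synchronous parser is an assumed reading of the abstract.

```python
def classify_generalized_parser(input_dims: int, grammar_dims: int) -> str:
    """Label a parser configuration by comparing input and grammar dimensionality.

    input_dims:   number of component strings in the input tuple (hypothetical name)
    grammar_dims: number of components the grammar ranges over (hypothetical name)
    """
    if input_dims < 1 or grammar_dims < 1:
        raise ValueError("both dimension counts must be at least 1")

    # Input has fewer dimensions than the grammar: the missing
    # components must be generated, so the parser acts as a translator.
    if input_dims < grammar_dims:
        return "translator"

    # Grammar has fewer dimensions than the input: the parser acts
    # as a synchronizer.
    if grammar_dims < input_dims:
        return "synchronizer"

    # Equal dimensions: a single string with a string grammar is ordinary parsing.
    if input_dims == 1:
        return "ordinary parser"

    # Assumption: equal dimensions greater than 1 correspond to
    # synchronous parsing of a multitext.
    return "synchronous parser"


if __name__ == "__main__":
    for dims in [(1, 1), (2, 2), (1, 2), (2, 1)]:
        print(dims, classify_generalized_parser(*dims))
```

For example, a one-dimensional input paired with a two-dimensional grammar is labeled a translator, since the second component has to be produced rather than read.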

Comments:	25 pages
Subjects:	Computation and Language (cs.CL)
ACM classes:	I.2.7
Cite as:	arXiv:cs/0407005 [cs.CL]
	(or arXiv:cs/0407005v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.cs/0407005

Submission history

From: Wei Wang
[v1] Thu, 1 Jul 2004 22:02:10 UTC (60 KB)
[v2] Wed, 23 Nov 2005 05:02:33 UTC (121 KB)
[v3] Thu, 24 Nov 2005 04:06:23 UTC (101 KB)
