Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks

Lake, Brenden M.; Baroni, Marco

arXiv:1711.00350 (cs)

[Submitted on 31 Oct 2017 (v1), last revised 6 Jun 2018 (this version, v3)]

Abstract: Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.
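
To make the idea of commands paired with action sequences concrete, here is a minimal sketch of a rule-based interpreter for a tiny SCAN-like fragment. It is an illustration only: the function name `interpret`, the action token spellings, and the supported subset ("and", "twice", "thrice", a few primitive verbs) are assumptions chosen for this example, not the authors' grammar or code. The point it shows is that a systematic interpreter handles a newly added primitive such as "dax" in every context by construction, which is the kind of generalization the abstract says the tested RNNs fail to achieve.

```python
# Toy interpreter for a tiny SCAN-like fragment.
# Assumption: token names and the supported subset are illustrative only.
PRIMITIVES = {"jump": ["JUMP"], "walk": ["WALK"], "run": ["RUN"], "look": ["LOOK"]}
REPEATS = {"twice": 2, "thrice": 3}


def interpret(command, primitives=PRIMITIVES):
    """Map a command string to a list of action tokens."""
    words = command.split()
    if not words:
        raise ValueError("empty command")
    # "x and y": perform x, then y. Splitting on "and" first means
    # modifiers such as "twice" apply only to their own conjunct.
    if "and" in words:
        i = words.index("and")
        left = " ".join(words[:i])
        right = " ".join(words[i + 1:])
        return interpret(left, primitives) + interpret(right, primitives)
    # "x twice" / "x thrice": repeat the actions for x.
    if words[-1] in REPEATS:
        return interpret(" ".join(words[:-1]), primitives) * REPEATS[words[-1]]
    # A single primitive verb.
    if len(words) == 1 and words[0] in primitives:
        return list(primitives[words[0]])
    raise ValueError(f"cannot interpret: {command!r}")


# Teaching the interpreter one new verb is enough for it to compose
# with every existing rule.
vocabulary = dict(PRIMITIVES, dax=["DAX"])
print(interpret("dax twice", vocabulary))
print(interpret("walk and dax", vocabulary))
```

In this sketch, adding the single entry for "dax" is all that is needed for "dax twice" or "walk and dax" to be interpreted, because the composition rules never refer to specific verbs.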

Comments:	Published at the 35th International Conference on Machine Learning (ICML 2018)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1711.00350 [cs.CL]
	(or arXiv:1711.00350v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1711.00350
Journal reference:	Lake, B. M. and Baroni, M. (2018). Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. International Conference on Machine Learning (ICML)

Submission history

From: Brenden Lake
[v1] Tue, 31 Oct 2017 01:50:02 UTC (121 KB)
[v2] Sun, 11 Feb 2018 21:55:39 UTC (315 KB)
[v3] Wed, 6 Jun 2018 20:52:51 UTC (1,143 KB)
