Predicting and interpreting embeddings for out of vocabulary words in downstream tasks

Garneau, Nicolas; Leboeuf, Jean-Samuel; Lamontagne, Luc


arXiv:1903.00724 (cs)

[Submitted on 2 Mar 2019]



Abstract: We propose a novel way to handle out-of-vocabulary (OOV) words in downstream natural language processing (NLP) tasks. We implement a network that predicts useful embeddings for OOV words based on their morphology and on the context in which they appear. Our model also incorporates an attention mechanism that indicates how much focus is allocated to the left context words, the right context words, or the word's characters, which makes the prediction more interpretable. The model is a "drop-in" module that is jointly trained with the downstream task's neural network, thus producing embeddings specialized for the task at hand. When the task is mostly syntactic, we observe that our model focuses most of its attention on surface-form characters. For more semantic tasks, on the other hand, the network allocates more attention to the surrounding words. In all our tests, the module helps the network achieve better performance than simple random embeddings do.
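
The attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the left context, the right context, and the characters are encoded or how the attention scores are computed. The sketch assumes each of the three sources has already been encoded as a vector of the same size, and it scores each source with a dot product against a hypothetical learned vector `score_vec`. The names `predict_oov_embedding` and `score_vec` are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_oov_embedding(left_ctx, right_ctx, chars, score_vec):
    """Combine three same-sized vectors into one embedding with attention.

    left_ctx, right_ctx, chars: assumed pre-computed encodings of the left
        context words, the right context words, and the word's characters.
    score_vec: hypothetical learned scoring vector (an assumption of this
        sketch, not something stated in the abstract).

    Returns (embedding, weights), where weights holds one attention weight
    per source in the order [left, right, characters].
    """
    sources = [left_ctx, right_ctx, chars]
    # One scalar score per source, normalized into attention weights.
    weights = softmax([dot(score_vec, s) for s in sources])
    dim = len(left_ctx)
    # The predicted embedding is the attention-weighted sum of the sources.
    embedding = [
        sum(w * s[i] for w, s in zip(weights, sources)) for i in range(dim)
    ]
    return embedding, weights


# Toy inputs, chosen arbitrarily for illustration.
left = [0.1, 0.0, 0.2]
right = [0.0, 0.3, 0.1]
chars = [0.9, 0.8, 0.7]
score_vec = [1.0, 1.0, 1.0]

embedding, weights = predict_oov_embedding(left, right, chars, score_vec)
print(weights)
```

The three weights sum to 1, so they can be read directly as the share of attention given to each source. With these toy inputs the character encoding receives the largest weight because it has the highest score, which mirrors the kind of inspection the abstract describes for interpreting predictions.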

Comments:	2 pages, 0 figures, 2 tables
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1903.00724 [cs.CL]
	(or arXiv:1903.00724v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1903.00724
Journal reference:	Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Submission history

From: Jean-Samuel Leboeuf
[v1] Sat, 2 Mar 2019 15:32:39 UTC (35 KB)
