Second-Order Word Embeddings from Nearest Neighbor Topological Features

Newman-Griffis, Denis; Fosler-Lussier, Eric

Computer Science > Computation and Language

arXiv:1705.08488v1 (cs)

[Submitted on 23 May 2017]

Title: Second-Order Word Embeddings from Nearest Neighbor Topological Features

Authors: Denis Newman-Griffis, Eric Fosler-Lussier

Abstract: We introduce second-order vector representations of words, induced from nearest neighborhood topological features in pre-trained contextual word embeddings. We then analyze the effects of using second-order embeddings as input features in two deep natural language processing models, for named entity recognition and recognizing textual entailment, as well as a linear model for paraphrase recognition. Surprisingly, we find that nearest neighbor information alone is sufficient to capture most of the performance benefits derived from using pre-trained word embeddings. Furthermore, second-order embeddings are able to handle highly heterogeneous data better than first-order representations, though at the cost of some specificity. Additionally, augmenting contextual embeddings with second-order information further improves model performance in some cases. Due to variance in the random initializations of word embeddings, utilizing nearest neighbor features from multiple first-order embedding samples can also contribute to downstream performance gains. Finally, we identify intriguing characteristics of second-order embedding spaces for further research, including much higher density and different semantic interpretations of cosine similarity.
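
As a rough illustration of the "nearest neighbor topological features" mentioned in the abstract, the following sketch builds a k-nearest-neighbor graph over a handful of word vectors using cosine similarity. The toy vectors, the function names `cosine` and `knn_graph`, and the choice of `k` are all assumptions made for illustration; the abstract does not specify how neighborhoods are computed or how the resulting graph is turned into second-order embeddings, so this covers only a plausible first step.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_graph(embeddings, k):
    """Map each word to its k most cosine-similar other words.

    This is a hypothetical helper, not the authors' implementation.
    Ties are broken alphabetically so the output is deterministic.
    """
    graph = {}
    for word, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != word
        ]
        # Highest similarity first; alphabetical order on ties.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[word] = [other for _, other in scored[:k]]
    return graph


# Hypothetical toy "first-order" embeddings (not taken from the paper).
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}

neighbors = knn_graph(toy_embeddings, k=1)
print(neighbors)
```

In this toy setting each word's single nearest neighbor is its semantic partner ("cat" with "dog", "car" with "truck"). A second-order representation would then be derived from this graph structure alone rather than from the original vectors.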

Comments:	Submitted to NIPS 2017. (8 pages + 4 reference)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1705.08488 [cs.CL]
	(or arXiv:1705.08488v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1705.08488

Submission history

From: Denis Newman-Griffis
[v1] Tue, 23 May 2017 19:12:05 UTC (36 KB)
