MEMORY-VQ: Compression for Tractable Internet-Scale Memory

Zemlyanskiy, Yury; de Jong, Michiel; Vilnis, Luke; Ontañón, Santiago; Cohen, William W.; Sanghai, Sumit; Ainslie, Joshua

arXiv:2308.14903 (cs)

[Submitted on 28 Aug 2023]

Abstract: Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world. Memory-based methods like LUMEN pre-compute token representations for retrieved passages to drastically speed up inference. However, storing these pre-computed representations greatly increases storage requirements.
We propose MEMORY-VQ, a new method to reduce the storage requirements of memory-augmented models without sacrificing performance. Our method uses a vector quantization variational autoencoder (VQ-VAE) to compress token representations. We apply MEMORY-VQ to the LUMEN model to obtain LUMEN-VQ, a memory model that achieves a 16x compression rate with comparable performance on the KILT benchmark. LUMEN-VQ enables practical retrieval augmentation even for extremely large retrieval corpora.
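
To make the idea of compressing token representations with vector quantization more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a product-quantization style layout in which each vector is split into equal subvectors and each subvector is replaced by the index of its nearest codebook entry. The function names, the tiny hand-written codebooks, and the sizes passed to `compression_ratio` are all hypothetical; in a VQ-VAE the codebooks would be learned during training rather than fixed by hand.

```python
def nearest_code(subvec, codebook):
    """Return the index of the codebook entry closest to subvec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(subvec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def quantize(vec, codebooks):
    """Split vec into len(codebooks) equal subvectors and store one index each."""
    sub_dim = len(vec) // len(codebooks)
    return [
        nearest_code(vec[i * sub_dim:(i + 1) * sub_dim], codebook)
        for i, codebook in enumerate(codebooks)
    ]


def dequantize(codes, codebooks):
    """Rebuild an approximate vector by concatenating the selected entries."""
    out = []
    for idx, codebook in zip(codes, codebooks):
        out.extend(codebook[idx])
    return out


def compression_ratio(dim, bits_per_value, num_subspaces, codebook_size):
    """Bits for the raw vector divided by bits for its integer codes.

    Assumes codebook_size >= 2 and ignores the one-time cost of storing
    the codebooks themselves.
    """
    bits_per_code = (codebook_size - 1).bit_length()
    return (dim * bits_per_value) / (num_subspaces * bits_per_code)


# Hypothetical toy setup: a 4-dimensional vector, 2 subspaces, 2 codes each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for the first half of the vector
    [[0.0, 1.0], [1.0, 0.0]],  # codebook for the second half
]
token_vec = [0.9, 1.1, 0.1, 0.8]

codes = quantize(token_vec, codebooks)
print(codes)                          # [1, 0]
print(dequantize(codes, codebooks))   # [1.0, 1.0, 0.0, 1.0]

# Illustrative sizes only (not taken from the paper): 64 values at 16 bits
# each, replaced by 8 codes of 8 bits each.
print(compression_ratio(dim=64, bits_per_value=16,
                        num_subspaces=8, codebook_size=256))  # 16.0
```

Only the integer codes need to be stored per token; the codebooks are shared across the whole corpus, which is why their cost is left out of the ratio in this sketch.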

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2308.14903 [cs.CL]
	(or arXiv:2308.14903v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2308.14903

Submission history

From: Yury Zemlyanskiy
[v1] Mon, 28 Aug 2023 21:11:18 UTC (34 KB)
