Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

Acharya, Anish; Goel, Rahul; Metallinou, Angeliki; Dhillon, Inderjit

arXiv:1811.00641 (cs)

[Submitted on 1 Nov 2018]

Abstract: Deep learning models have become the state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed, and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and that it outperforms alternative methods such as fixed-point quantization and offline word embedding compression. We also analyze the inference time and storage space of our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain the lost accuracy without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
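
As a rough illustration of the idea in the abstract, the sketch below factorizes an embedding matrix of shape (vocabulary size × embedding dimension) into two thin matrices and computes the resulting parameter savings. It assumes a truncated SVD as the factorization, which the abstract does not specify, and it omits the re-training step on the downstream task. The function names, the random matrix, and the sizes (1000 words, 300 dimensions, rank 23) are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np


def factorize_embedding(weights, rank):
    """Approximate weights (V x d) as a @ b with a: (V x rank), b: (rank x d).

    Uses a truncated SVD; this is an assumed choice of factorization.
    """
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    b = vt[:rank, :]
    return a, b


def compression_ratio(vocab_size, dim, rank):
    """Fraction of embedding parameters removed by the factorization."""
    original = vocab_size * dim
    compressed = rank * (vocab_size + dim)
    return 1.0 - compressed / original


# Hypothetical sizes; a random matrix stands in for trained embeddings.
rng = np.random.default_rng(0)
vocab_size, dim, rank = 1000, 300, 23
weights = rng.standard_normal((vocab_size, dim))

a, b = factorize_embedding(weights, rank)

# Embedding lookup with the compressed layer: select rows of a, then project with b.
token_ids = np.array([3, 17, 256])
vectors = a[token_ids] @ b  # shape (3, dim)

ratio = compression_ratio(vocab_size, dim, rank)
```

With these sizes the two factors hold 23 × (1000 + 300) = 29,900 parameters instead of 300,000, so the rank is the knob that sets the compression ratio. A lookup costs one extra small matrix product per token, and in this sketch the factors would still need fine-tuning on the task to recover accuracy.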

Comments:	Accepted at the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Numerical Analysis (math.NA); Machine Learning (stat.ML)
Cite as:	arXiv:1811.00641 [cs.LG]
	(or arXiv:1811.00641v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1811.00641

Submission history

From: Anish Acharya
[v1] Thu, 1 Nov 2018 21:38:18 UTC (1,350 KB)
