Compressing LSTM Networks by Matrix Product Operators

Gao, Ze-Feng; Sun, Xingwei; Gao, Lan; Li, Junfeng; Lu, Zhong-Yi

Computer Science > Networking and Internet Architecture

arXiv:2012.11943v2 (cs)

[Submitted on 22 Dec 2020 (v1), revised 4 Jan 2021 (this version, v2), latest version 31 Mar 2022 (v3)]

Abstract: Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art algorithms for Natural Language Processing (NLP). However, an LSTM model contains a large number of parameters and therefore needs a large amount of memory to operate. As a result, it usually requires substantial computational resources for training and for predicting on new data, and it suffers from computational inefficiency. Here we propose an alternative LSTM model that significantly reduces the number of parameters by representing the weight parameters with matrix product operators (MPO), which are used in physics to characterize local correlations in quantum states. We further experimentally compare compressed models based on the MPO-LSTM model with those obtained by the pruning method on sequence classification and sequence prediction tasks. The experimental results show that our proposed MPO-based method outperforms the pruning method.
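To make the idea of representing a weight matrix by matrix product operators concrete, here is a minimal NumPy sketch. It is not taken from the paper: it assumes a generic sequential-SVD factorization, and the function names (`mpo_decompose`, `mpo_reconstruct`), the `max_bond` truncation parameter, and the 16 x 16 example shape are all hypothetical choices made for illustration. The abstract does not specify how the authors factorize or train the LSTM weights, so this only shows how a dense matrix can be written as a chain of small local tensors with fewer parameters.

```python
import numpy as np

def mpo_decompose(W, in_dims, out_dims, max_bond=None):
    """Factor W (prod(in_dims) x prod(out_dims)) into a chain of local tensors.

    Each returned core has shape (D_left, i_k, j_k, D_right). This is an
    illustrative sequential-SVD scheme, not the paper's stated procedure.
    """
    n = len(in_dims)
    # Split row/column indices into factors, then interleave them: (i1, j1, i2, j2, ...)
    T = W.reshape(*in_dims, *out_dims)
    perm = [ax for k in range(n) for ax in (k, n + k)]
    T = T.transpose(perm)
    cores, bond = [], 1
    for k in range(n - 1):
        # Group (left bond, i_k, j_k) as rows and everything else as columns
        T = T.reshape(bond * in_dims[k] * out_dims[k], -1)
        U, S, Vt = np.linalg.svd(T, full_matrices=False)
        r = len(S) if max_bond is None else min(max_bond, len(S))
        cores.append(U[:, :r].reshape(bond, in_dims[k], out_dims[k], r))
        # Carry the singular values into the remainder of the chain
        T = S[:r, None] * Vt[:r]
        bond = r
    cores.append(T.reshape(bond, in_dims[-1], out_dims[-1], 1))
    return cores

def mpo_reconstruct(cores):
    """Contract the chain of cores back into a dense matrix."""
    n = len(cores)
    T = cores[0]
    for core in cores[1:]:
        T = np.tensordot(T, core, axes=([-1], [0]))
    T = T.reshape(T.shape[1:-1])           # drop the trivial boundary bonds
    in_dims, out_dims = T.shape[0::2], T.shape[1::2]
    perm = list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2))
    T = T.transpose(perm)                  # back to (i1..in, j1..jn)
    return T.reshape(int(np.prod(in_dims)), int(np.prod(out_dims)))

# Hypothetical example: a random 16 x 16 "weight matrix" split as (4*4) x (4*4)
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))
exact = mpo_decompose(W, (4, 4), (4, 4))                  # no truncation
compressed = mpo_decompose(W, (4, 4), (4, 4), max_bond=2) # truncated bond
n_params = sum(core.size for core in compressed)
```

Without truncation the chain reproduces `W` up to floating-point error; with a small `max_bond` the parameter count drops (here from 256 to `n_params`) at the cost of an approximation error. A random matrix compresses poorly, so the accuracy of a truncated chain depends on the structure of the actual weights.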

Comments:	2 figures, 5 tables
Subjects:	Networking and Internet Architecture (cs.NI); Computational Physics (physics.comp-ph); Quantum Physics (quant-ph)
Cite as:	arXiv:2012.11943 [cs.NI]
	(or arXiv:2012.11943v2 [cs.NI] for this version)
	https://doi.org/10.48550/arXiv.2012.11943

Submission history

From: Ze-Feng Gao
[v1] Tue, 22 Dec 2020 11:50:06 UTC (186 KB)
[v2] Mon, 4 Jan 2021 10:48:12 UTC (141 KB)
[v3] Thu, 31 Mar 2022 05:54:05 UTC (143 KB)
