TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

Huang, Zhiheng; Xu, Peng; Liang, Davis; Mishra, Ajay; Xiang, Bing

arXiv:2003.07000 (cs)

[Submitted on 16 Mar 2020]

Abstract: Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks, including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Before the transformer era, bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate how these two modeling techniques can be combined to create a more powerful model architecture. We propose a new architecture, denoted Transformer with BLSTM (TRANS-BLSTM), which integrates a BLSTM layer into each transformer block, yielding a joint modeling framework for the transformer and BLSTM. We show that TRANS-BLSTM models consistently improve accuracy over BERT baselines in GLUE and SQuAD 1.1 experiments. Our TRANS-BLSTM model obtains an F1 score of 94.01% on the SQuAD 1.1 development dataset, which is comparable to the state-of-the-art result.
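
The abstract does not spell out how the BLSTM layer is wired into each transformer block. As a rough sketch of one possible arrangement, assume the BLSTM reads the same block input X as the self-attention sublayer, and that its projected output is added to the transformer output before a final layer normalization. The symbols X, A, T, B, Y, W, and b, and the additive combination itself, are assumptions made for illustration and are not taken from the abstract:

```latex
\begin{align*}
A &= \mathrm{LayerNorm}\bigl(X + \mathrm{MultiHeadAttention}(X)\bigr) \\
T &= \mathrm{LayerNorm}\bigl(A + \mathrm{FeedForward}(A)\bigr) \\
\overrightarrow{h}_t &= \mathrm{LSTM}_{\mathrm{fwd}}\bigl(x_t, \overrightarrow{h}_{t-1}\bigr), \qquad
\overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{bwd}}\bigl(x_t, \overleftarrow{h}_{t+1}\bigr) \\
B_t &= W\,\bigl[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\bigr] + b \\
Y &= \mathrm{LayerNorm}(T + B)
\end{align*}
```

Here x_t is row t of the block input X, B_t is row t of B, and the linear map (W, b) projects the concatenated forward and backward states back to the model width so that T and B can be summed. Under this assumed wiring, the attention path supplies global context while the recurrent path supplies an order-sensitive summary of the same input.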

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2003.07000 [cs.CL]
	(or arXiv:2003.07000v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2003.07000

Submission history

From: Zhiheng Huang
[v1] Mon, 16 Mar 2020 03:38:51 UTC (183 KB)
