Faster Inference of Integer SWIN Transformer by Removing the GELU Activation

Tayaranian, Mohammadreza; Mozafari, Seyyed Hasan; Clark, James J.; Meyer, Brett; Gross, Warren

Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.01169 (cs)

[Submitted on 2 Feb 2024]

Abstract: The SWIN transformer is a prominent vision transformer model that achieves state-of-the-art accuracy in image classification tasks. Despite this success, its unique architecture causes slower inference compared with similar deep neural networks. Integer quantization of the model is one of the methods used to improve its inference latency. However, state-of-the-art methods have not been able to fully quantize the model. In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations associated with the GELU activation in the SWIN transformer. While previous work proposed replacing the non-integer operations with linear approximation functions, we propose replacing GELU with the ReLU activation. The advantage of ReLU over previous methods is its low memory and computation complexity. We use iterative knowledge distillation to compensate for the accuracy lost by replacing GELU with ReLU. We quantize our GELU-less SWIN transformer and show that on an NVIDIA RTX 4090 GPU we can improve the inference latency of the quantized SWIN transformer by at least $11\%$ while keeping the accuracy drop under $0.5\%$ on the ImageNet evaluation dataset.
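To make the two ingredients named in the abstract concrete, here is a minimal standard-library sketch. It is not the authors' implementation: the abstract does not give the iterative distillation schedule or the loss, so the generic temperature-softened KL loss below is an assumption, and the names `gelu`, `relu`, `softmax`, `distillation_loss`, and `temperature` are hypothetical.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF.

    The erf call is the kind of floating-point work an integer
    pipeline has to approximate or avoid.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu(x: float) -> float:
    """ReLU: max(0, x). A single comparison, so it needs no float math."""
    return max(0.0, x)


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed loss: KL(teacher || student) on softened outputs, times T^2.

    This is a generic knowledge-distillation loss, not one confirmed
    by the paper's abstract.
    """
    p = softmax(teacher_logits, temperature)  # teacher (GELU model)
    q = softmax(student_logits, temperature)  # student (ReLU model)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2
```

In this sketch, `gelu` and `relu` agree closely for large positive inputs and differ mainly for small negative ones, where GELU is slightly below zero and ReLU is exactly zero. The distillation term is the sort of signal a ReLU student could use to recover accuracy from a GELU teacher.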

Comments:	5 pages, 1 figure. Submitted to Edge Intelligence Workshop III, an AAAI 2024 workshop
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.01169 [cs.CV]
	(or arXiv:2402.01169v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.01169

Submission history

From: Sayed Mohammadreza Tayaranian Hosseini
[v1] Fri, 2 Feb 2024 06:23:00 UTC (23 KB)
