Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

Huh, Minyoung; Cheung, Brian; Agrawal, Pulkit; Isola, Phillip

arXiv:2305.08842 (cs)

[Submitted on 15 May 2023]

Abstract: This work examines the challenges of training vector-quantized neural networks with straight-through estimation. We find that a primary cause of training instability is the discrepancy between the model embedding and the code-vector distribution. We identify the factors that contribute to this issue, including codebook gradient sparsity and the asymmetric nature of the commitment loss, which leads to misaligned code-vector assignments. We propose to address this issue via affine re-parameterization of the code vectors. Additionally, we introduce an alternating optimization to reduce the gradient error introduced by straight-through estimation. Moreover, we propose an improvement to the commitment loss to ensure better alignment between the codebook representation and the model embedding. These optimization methods improve the mathematical approximation of straight-through estimation and, ultimately, model performance. We demonstrate the effectiveness of our methods on several common model architectures, such as AlexNet, ResNet, and ViT, across various tasks, including image classification and generative modeling.
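
To make the terms in the abstract concrete, the following is a minimal pure-Python sketch of nearest-neighbour vector quantization with a straight-through gradient. It is not the authors' implementation. The abstract does not specify the form of the affine re-parameterization, so the shared per-dimension scale and shift used here are an assumption, and every function name, the toy codebook, and the illustrative loss are hypothetical.

```python
# Hypothetical sketch: vector quantization with a straight-through gradient
# and an assumed shared affine map on the codebook. Not the paper's code.

def affine_codebook(codebook, scale, shift):
    """Apply one shared per-dimension scale and shift to every code vector.
    (Assumed form; the abstract only says 'affine re-parameterization'.)"""
    return [[s * c + b for c, s, b in zip(code, scale, shift)] for code in codebook]

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def quantize(z, codebook):
    """Return (index, code) of the code vector nearest to embedding z."""
    index = min(range(len(codebook)), key=lambda i: squared_distance(z, codebook[i]))
    return index, codebook[index]

def straight_through_backward(grad_wrt_quantized):
    """Straight-through estimation: treat d(quantized)/d(z) as the identity,
    so the gradient at the quantized output is copied unchanged to z."""
    return list(grad_wrt_quantized)

# Toy data (hypothetical values chosen for illustration).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
scale, shift = [2.0, 2.0], [0.5, 0.5]
z = [2.2, 2.4]  # model embedding

# Forward pass: snap z to the nearest re-parameterized code vector.
index, q = quantize(z, affine_codebook(codebook, scale, shift))

# Illustrative downstream loss L = 0.5 * ||q - target||^2, so dL/dq = q - target.
target = [2.0, 2.0]
grad_q = [qi - ti for qi, ti in zip(q, target)]

# Backward pass: the gradient computed at q is applied to z as if q == z.
grad_z = straight_through_backward(grad_q)
```

In this sketch the gradient is evaluated at `q` but applied to `z`, so the gap between the two is the source of the gradient error the abstract refers to. Under the assumed affine form, `scale` and `shift` are involved in every assignment, whereas an individual code vector is touched only when it is selected.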

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2305.08842 [cs.LG]
	(or arXiv:2305.08842v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.08842

Submission history

From: Minyoung Huh
[v1] Mon, 15 May 2023 17:56:36 UTC (4,076 KB)
