-
Layer-wise Quantization for Quantized Optimistic Dual Averaging
Authors:
Anh Duc Nguyen,
Ilia Markov,
Frank Zhengqing Wu,
Ali Ramezani-Kebrya,
Kimon Antonakopoulos,
Dan Alistarh,
Volkan Cevher
Abstract:
Modern deep neural networks exhibit heterogeneity across numerous layers of various types, such as residuals, multi-head attention, etc., due to varying structures (dimensions, activation functions, etc.) and distinct representation characteristics, which impact predictions. We develop a general layer-wise quantization framework with tight variance and code-length bounds, adapting to the heterogeneities over the course of training. We then apply a new layer-wise quantization technique within distributed variational inequalities (VIs), proposing a novel Quantized Optimistic Dual Averaging (QODA) algorithm with adaptive learning rates, which achieves competitive convergence rates for monotone VIs. We empirically show that QODA achieves up to a $150\%$ speedup over the baselines in end-to-end training time for training Wasserstein GAN on $12+$ GPUs.
Submitted 20 May, 2025;
originally announced May 2025.
-
On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and Efficient Gradient Methods
Authors:
Anh Duc Nguyen,
Tuan Dung Nguyen,
Quang Minh Nguyen,
Hoang H. Nguyen,
Lam M. Nguyen,
Kim-Chuan Toh
Abstract:
This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports and its applications in various AI tasks such as color transfer or domain adaptation. Hence, there is a need for fast approximations of POT as problem sizes grow in emerging applications. We first theoretically and experimentally investigate the infeasibility of the state-of-the-art Sinkhorn algorithm for POT due to its incompatible rounding procedure, which consequently degrades its qualitative performance in real-world applications like point-cloud registration. To this end, we propose a novel rounding algorithm for POT, and then provide a feasible Sinkhorn procedure with a revised computational complexity of $\widetilde{\mathcal{O}}(n^2/\varepsilon^4)$. Our rounding algorithm also permits the development of two first-order methods to approximate the POT problem. The first algorithm, Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD), finds an $\varepsilon$-approximate solution to the POT problem in $\widetilde{\mathcal{O}}(n^{2.5}/\varepsilon)$, which is better in $\varepsilon$ than revised Sinkhorn. The second method, Dual Extrapolation, achieves a computational complexity of $\widetilde{\mathcal{O}}(n^2/\varepsilon)$, thereby being the best in the literature. We further demonstrate the flexibility of POT compared to standard OT as well as the practicality of our algorithms on real applications where two marginal distributions are unbalanced.
Submitted 22 December, 2023; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Existence of a maximal solution of singular parabolic equations with absorptions: quenching phenomenon and the instantaneous shrinking phenomenon
Authors:
Anh Dao Nguyen
Abstract:
This paper deals with nonnegative solutions of one-dimensional degenerate parabolic equations with a homogeneous Dirichlet boundary condition. To obtain an existence result, we prove a sharp gradient estimate of $|u_x|$. In addition, we investigate the behavior of nonnegative solutions, such as the quenching phenomenon and the finite speed of propagation. Our results for the Dirichlet problem are then extended to the associated Cauchy problem. Finally, we show that instantaneous shrinking of the compact support of nonnegative solutions occurs if $f$ satisfies some growth condition.
Submitted 9 April, 2015;
originally announced April 2015.