Showing 1–1 of 1 results for author: Wu, F Z

  1. arXiv:2505.14371  [pdf, ps, other]

    cs.LG math.OC

    Layer-wise Quantization for Quantized Optimistic Dual Averaging

    Authors: Anh Duc Nguyen, Ilia Markov, Frank Zhengqing Wu, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher

    Abstract: Modern deep neural networks exhibit heterogeneity across numerous layers of various types such as residuals, multi-head attention, etc., due to varying structures (dimensions, activation functions, etc.), distinct representation characteristics, which impact predictions. We develop a general layer-wise quantization framework with tight variance and code-length bounds, adapting to the heterogeneiti…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: Accepted at the International Conference on Machine Learning (ICML 2025)