Showing 1–1 of 1 results for author: Wiest, L

  1. arXiv:2411.03934  [pdf, other]

    cs.LG cs.AI cs.CL

    Interactions Across Blocks in Post-Training Quantization of Large Language Models

    Authors: Khasmamad Shabanovi, Lukas Wiest, Vladimir Golkov, Daniel Cremers, Thomas Pfeil

    Abstract: Post-training quantization is widely employed to reduce the computational demands of neural networks. Typically, individual substructures, such as layers or blocks of layers, are quantized with the objective of minimizing quantization errors in their pre-activations by fine-tuning the corresponding weights. Deriving this local objective from the global objective of minimizing task loss involves tw…

    Submitted 6 November, 2024; originally announced November 2024.