Showing 1–1 of 1 results for author: Mithassel, B A

  1. arXiv:2504.03360  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency

    Authors: Erik Johannes Husom, Arda Goknil, Merve Astekin, Lwin Khin Shar, Andre Kåsen, Sagar Sen, Benedikt Andreas Mithassel, Ahmet Soylu

    Abstract: Deploying Large Language Models (LLMs) on edge devices presents significant challenges due to computational constraints, memory limitations, inference speed, and energy consumption. Model quantization has emerged as a key technique to enable efficient LLM inference by reducing model size and computational overhead. In this study, we conduct a comprehensive analysis of 28 quantized LLMs from the Ol…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 30 pages, 14 figures