Showing 1–1 of 1 results for author: Azeez, M A

  1. arXiv:2506.18952

    cs.LG cs.AI cs.CL

    LLMs on a Budget? Say HOLA

    Authors: Zohaib Hasan Siddiqui, Jiechao Gao, Ebad Shabbir, Mohammad Anas Azeez, Rafiq Ali, Gautam Siddharth Kashyap, Usman Naseem

    Abstract: Running Large Language Models (LLMs) on edge devices is constrained by high compute and memory demands, posing a barrier for real-time applications in sectors like healthcare, education, and embedded systems. Current solutions such as quantization, pruning, and retrieval-augmented generation (RAG) offer only partial optimizations and often compromise on speed or accuracy. We introduce HOLA, an end-…

    Submitted 23 June, 2025; originally announced June 2025.