Showing 1–4 of 4 results for author: Benazir, A

  1. arXiv:2508.08531

    cs.PF

    Profiling Large Language Model Inference on Apple Silicon: A Quantization Perspective

    Authors: Afsara Benazir, Felix Xiaozhu Lin

    Abstract: A systematic understanding of Apple Silicon is lacking in the current landscape of hardware efficiency; research focus is largely centered on accelerating GPUs for large-scale training or inference on CUDA devices. This paper investigates Apple Silicon's unique memory architecture that offers a unified memory integrating CPU and GPU memory and its implications for on-device LLM inference. We dec…

    Submitted 11 August, 2025; originally announced August 2025.

  2. arXiv:2504.17984

    cs.OS cs.SE

    A Journey of Modern OS Construction From boot to DOOM

    Authors: Wonkyo Choe, Rongxiang Wang, Afsara Benazir, Felix Xiaozhu Lin

    Abstract: VOS is a first-of-its-kind instructional OS that: (1) Runs on commodity, portable hardware. (2) Showcases modern features, including per-app address spaces, threading, commodity filesystems, USB, DMA, multicore, self-hosted debugging, and a window manager. (3) Supports rich applications such as 2D/3D games, music and video players, and a blockchain miner. Unlike traditional instructional systems,…

    Submitted 24 April, 2025; originally announced April 2025.

  3. arXiv:2502.01649

    eess.AS cs.LG cs.SD

    Privacy-Preserving Edge Speech Understanding with Tiny Foundation Models

    Authors: Afsara Benazir, Felix Xiaozhu Lin

    Abstract: Robust speech recognition systems rely on cloud service providers for inference. It needs to ensure that an untrustworthy provider cannot deduce the sensitive content in speech. Sanitization can be done on speech content keeping in mind that it has to avoid compromising transcription accuracy. Realizing the underutilized capabilities of tiny speech foundation models (FMs), for the first time, we…

    Submitted 29 January, 2025; originally announced February 2025.

  4. arXiv:2311.18188

    eess.AS cs.LG

    Speech Understanding on Tiny Devices with A Learning Cache

    Authors: Afsara Benazir, Zhiming Xu, Felix Xiaozhu Lin

    Abstract: This paper addresses spoken language understanding (SLU) on microcontroller-like embedded devices, integrating on-device execution with cloud offloading in a novel fashion. We leverage temporal locality in the speech inputs to a device and reuse recent SLU inferences accordingly. Our idea is simple: let the device match incoming inputs against cached results, and only offload inputs not matched to…

    Submitted 8 May, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at MobiSys '24