Showing 1–2 of 2 results for author: Graham, J J

  1. Real-Time In-Network Machine Learning on P4-Programmable FPGA SmartNICs with Fixed-Point Arithmetic and Taylor

    Authors: Mohammad Firas Sada, John J. Graham, Mahidhar Tatineni, Dmitry Mishin, Thomas A. DeFanti, Frank Würthwein

    Abstract: As machine learning (ML) applications become integral to modern network operations, there is an increasing demand for network programmability that enables low-latency ML inference for tasks such as Quality of Service (QoS) prediction and anomaly detection in cybersecurity. ML models provide adaptability through dynamic weight adjustments, making Programming Protocol-independent Packet Processors (…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: To appear in Proceedings of the Practice and Experience in Advanced Research Computing (PEARC25)

    Journal ref: Proceedings of the Practice and Experience in Advanced Research Computing PEARC '25, July 20-24, 2025, Columbus, OH, USA

  2. Serving LLMs in HPC Clusters: A Comparative Study of Qualcomm Cloud AI 100 Ultra and High-Performance GPUs

    Authors: Mohammad Firas Sada, John J. Graham, Elham E Khoda, Mahidhar Tatineni, Dmitry Mishin, Rajesh K. Gupta, Rick Wagner, Larry Smarr, Thomas A. DeFanti, Frank Würthwein

    Abstract: This study presents a benchmarking analysis of the Qualcomm Cloud AI 100 Ultra (QAic) accelerator for large language model (LLM) inference, evaluating its energy efficiency (throughput per watt) and performance against leading NVIDIA (A100, H200) and AMD (MI300A) GPUs within the National Research Platform (NRP) ecosystem. A total of 15 open-source LLMs, ranging from 117 million to 90 billion param…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: To appear in Proceedings of the Practice and Experience in Advanced Research Computing (PEARC '25)

    Journal ref: Proceedings of the Practice and Experience in Advanced Research Computing PEARC25 2025