Showing 1–7 of 7 results for author: Ghafouri, S

  1. arXiv:2506.09397  [pdf, ps, other]

    cs.DC cs.AI cs.LG cs.NI

    SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving

    Authors: Xiangchen Li, Dimitrios Spatharakis, Saeid Ghafouri, Jiakun Fan, Dimitrios Nikolopoulos

    Abstract: Despite advancements in device capabilities, efficient inference of advanced large language models (LLMs) at the edge remains challenging due to limited device memory and power constraints. Existing strategies, such as aggressive quantization, pruning, or remote inference, trade accuracy for efficiency or lead to substantial cost burdens. This position paper introduces a new approach that le…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 6 pages, 9 figures, 2 tables

    MSC Class: 68T07; 68M14 ACM Class: I.2.6; C.2.4; C.1.4

  2. Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling

    Authors: Kamran Razavi, Saeid Ghafouri, Max Mühlhäuser, Pooyan Jamshidi, Lin Wang

    Abstract: Mobile and IoT applications increasingly adopt deep learning inference to provide intelligence. Inference requests are typically sent to a cloud infrastructure over a wireless network that is highly variable, leading to the challenge of dynamic Service Level Objectives (SLOs) at the request level. This paper presents Sponge, a novel deep learning inference serving system that maximizes resource ef…

    Submitted 23 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  3. arXiv:2312.03235  [pdf, other]

    cs.DC cs.PF

    HEET: A Heterogeneity Measure to Quantify the Difference across Distributed Computing Systems

    Authors: Ali Mokhtari, Saeid Ghafouri, Pooyan Jamshidi, Mohsen Amini Salehi

    Abstract: Although system heterogeneity has been extensively studied in the past, there is yet to be a study on measuring the impact of heterogeneity on system performance. For this purpose, we propose a heterogeneity measure that can characterize the impact of the heterogeneity of a system on its performance behavior in terms of throughput or makespan. We develop a mathematical model to characterize a hete…

    Submitted 5 December, 2023; originally announced December 2023.

  4. arXiv:2308.12871  [pdf, other]

    cs.DC cs.LG cs.PF

    IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency

    Authors: Saeid Ghafouri, Kamran Razavi, Mehran Salmani, Alireza Sanaee, Tania Lorido-Botran, Lin Wang, Joseph Doyle, Pooyan Jamshidi

    Abstract: Efficiently optimizing multi-model inference pipelines for fast, accurate, and cost-effective inference is a crucial challenge in machine learning production systems, given their tight end-to-end latency requirements. To simplify the exploration of the vast and intricate trade-off space of latency, accuracy, and cost in inference pipelines, providers frequently opt to consider one of them. However…

    Submitted 26 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Journal ref: Journal of Systems Research, 4(1) (2024)

  5. arXiv:2304.10892  [pdf, other]

    cs.LG cs.DC eess.SY

    Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems

    Authors: Mehran Salmani, Saeid Ghafouri, Alireza Sanaee, Kamran Razavi, Max Mühlhäuser, Joseph Doyle, Pooyan Jamshidi, Mohsen Sharifi

    Abstract: The use of machine learning (ML) inference for various applications is growing drastically. ML inference services engage with users directly, requiring fast and accurate responses. Moreover, these services face dynamic workloads of requests, imposing changes in their computing resources. Failing to right-size computing resources results in either latency service level objectives (SLOs) violations…

    Submitted 24 April, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  6. arXiv:2208.13166  [pdf]

    cs.SI cs.AI

    Influence Maximization (IM) in Complex Networks with Limited Visibility Using Statistical Methods

    Authors: Saeid Ghafouri, Seyed Hossein Khasteh, Seyed Omid Azarkasb

    Abstract: A social network (SN) is a social structure consisting of a group of people and the interactions between them. SNs have recently been widely used and, subsequently, have become suitable and popular platforms for product promotion and information diffusion. People in an SN directly influence each other's interests and behavior. One of the most important problems in SNs is to find people who can have…

    Submitted 11 September, 2022; v1 submitted 28 August, 2022; originally announced August 2022.

  7. arXiv:2208.13161  [pdf]

    cs.SI cs.AI

    Opinion Leader Detection in Online Social Networks Based on Output and Input Links

    Authors: Zahra Ghorbani, Seyed Hossein Khasteh, Saeid Ghafouri

    Abstract: The understanding of how users in a network update their opinions based on their neighbours' opinions has attracted a great deal of interest in the field of network science, and a growing body of literature recognises the significance of this issue. In this research paper, we propose a new dynamic model of opinion formation in directed networks. In this model, the opinion of each node is updated as…

    Submitted 28 August, 2022; originally announced August 2022.