Search | arXiv e-print repository

Benchmarking Different Application Types across Heterogeneous Cloud Compute Services

Authors: Nivedhitha Duggi, Masoud Rafiei, Mohsen Amini Salehi

Abstract: Infrastructure as a Service (IaaS) clouds have become the predominant underlying infrastructure for the operation of modern and smart technology. IaaS clouds have proven to be useful for multiple reasons such as reduced costs, increased speed and efficiency, and better reliability and scalability. Compute services offered by such clouds are heterogeneous -- they offer a set of architecturally diverse machines that are suited to efficiently executing different workloads. However, there has been little study to shed light on the performance of popular application types on these heterogeneous compute servers across different clouds. Such a study can help organizations to optimally (in terms of cost, latency, throughput, consumed energy, carbon footprint, etc.) employ cloud compute services. At HPCC lab, we have focused on such benchmarks in different research projects and, in this report, we curate those benchmarks in a single document to help other researchers in the community use them. Specifically, we introduce our benchmark datasets for three application types in three different domains, namely: Deep Neural Networks (DNN) Inference for industrial applications, Machine Learning (ML) Inference for assistive technology applications, and video transcoding for multimedia use cases.

Submitted 10 January, 2025; originally announced January 2025.

Comments: Technical Report. arXiv admin note: text overlap with arXiv:2011.11711 by other authors

arXiv:2411.19487 [pdf, other]

HE2C: A Holistic Approach for Allocating Latency-Sensitive AI Tasks across Edge-Cloud

Authors: Minseo Kim, Wei Shu, Mohsen Amini Salehi

Abstract: The high computational, memory, and energy demands of Deep Learning (DL) applications often exceed the capabilities of battery-powered edge devices, creating difficulties in meeting task deadlines and accuracy requirements. Unlike previous solutions that optimize a single metric (e.g., accuracy or energy efficiency), the HE2C framework is designed to holistically address the latency, memory, accuracy, throughput, and energy demands of DL applications across the edge-cloud continuum, thereby delivering a more comprehensive and effective user experience. HE2C comprises three key modules: (a) a "feasibility-check" module that evaluates the likelihood of meeting deadlines across both edge and cloud resources; (b) a "resource allocation strategy" that maximizes energy efficiency without sacrificing the inference accuracy; and (c) a "rescue module" that enhances throughput by leveraging approximate computing to trade accuracy for latency when necessary. Our primary objective is to prolong battery lifespan and maximize throughput and accuracy while adhering to strict latency constraints. Experimental evaluations in the context of wearable technologies for blind and visually impaired users demonstrate that HE2C significantly improves task throughput by completing a larger number of tasks within their specified deadlines, while preserving edge device battery and maintaining prediction accuracy with minimal latency impact. These results underscore HE2C's potential as a robust solution for resource management in latency-sensitive, energy-constrained edge-to-cloud environments.

Submitted 29 November, 2024; originally announced November 2024.

Comments: Accepted in Utility Cloud Computing (UCC '24) Conference

arXiv:2411.19485 [pdf, other]

Action Engine: An LLM-based Framework for Automatic FaaS Workflow Generation

Authors: Akiharu Esashi, Pawissanutt Lertpongrujikorn, Mohsen Amini Salehi

Abstract: Function as a Service (FaaS) is poised to become the foundation of the next generation of cloud systems due to its inherent advantages in scalability, cost-efficiency, and ease of use. However, challenges such as the need for specialized knowledge and difficulties in building function workflows persist for cloud-native application developers. To overcome these challenges and mitigate the burden of developing FaaS-based applications, in this paper, we propose a mechanism called Action Engine that makes use of Tool-Augmented Large Language Models (LLMs) at its kernel to interpret human language queries and automate FaaS workflow generation, thereby reducing the need for specialized expertise and manual design. Action Engine includes modules to identify relevant functions from the FaaS repository and seamlessly manage the data dependency between them, ensuring that the developer's query is processed and resolved. Beyond that, Action Engine can execute the generated workflow by feeding the user-provided parameters. Our evaluations show that Action Engine can generate workflows with up to 20% higher correctness without developer involvement. We notice that Action Engine can unlock FaaS workflow generation for non-cloud-savvy developers and expedite the development cycles of cloud-native applications.

Submitted 29 November, 2024; originally announced November 2024.

Comments: Accepted at Utility Cloud Computing (UCC '24) conference

arXiv:2410.16569 [pdf, other]

Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation

Authors: Pawissanutt Lertpongrujikorn, Hai Duc Nguyen, Mohsen Amini Salehi

Abstract: Current Serverless abstractions (e.g., FaaS) poorly support non-functional requirements (e.g., QoS and constraints), are provider-dependent, and are incompatible with other cloud abstractions (e.g., databases). As a result, application developers have to undergo numerous rounds of development and manual deployment refinements to finally achieve their desired quality and efficiency. In this paper, we present Object-as-a-Service (OaaS) -- a novel serverless paradigm that borrows the object-oriented programming concepts to encapsulate business logic, data, and non-functional requirements into a single deployment package, thereby streamlining provider-agnostic cloud-native application development. We also propose a declarative interface for the non-functional requirements of applications that relieves developers from daunting refinements to meet their desired QoS and deployment constraint targets. We realized the OaaS paradigm through a platform called Oparaca and evaluated it against various real-world applications and scenarios. The evaluation results demonstrate that Oparaca can enhance application performance by 60X and improve reliability by 50X through latency, throughput, and availability enforcement -- all with remarkably less development and deployment time and effort.

Submitted 21 October, 2024; originally announced October 2024.

Comments: Accepted at ACM Symposium of Cloud Computing (SoCC '24)

arXiv:2409.15802 [pdf, other]

A Multi-Level Approach for Class Imbalance Problem in Federated Learning for Remote Industry 4.0 Applications

Authors: Razin Farhan Hussain, Mohsen Amini Salehi

Abstract: Deep neural network (DNN) models are effective solutions for Industry 4.0 applications (e.g., oil spill detection, fire detection, anomaly detection). However, training a DNN model needs a considerable amount of data collected from various sources and transferred to the central cloud server, which can be expensive and sensitive to privacy. For instance, in the remote offshore oil field where network connectivity is vulnerable, a federated fog environment can be a potential computing platform. Hence it is feasible to perform computation within the federation. On the contrary, performing DNN model training using fog systems poses a security issue that the federated learning (FL) technique can resolve. In this case, the new challenge is the class imbalance problem that can be inherited in local data sets and can degrade the performance of the global model. Therefore, FL training needs to be performed considering the class imbalance problem locally. In addition, an efficient technique to select the relevant worker model needs to be adopted at the global level to increase the robustness of the global model. Accordingly, we utilize one of the suitable loss functions addressing the class imbalance in workers at the local level. In addition, we employ a dynamic threshold mechanism with user-defined worker's weight to efficiently select workers for aggregation that improve the global model's robustness. Finally, we perform an extensive empirical evaluation to explore the benefits of our solution and find up to 3-5% performance improvement over baseline federated learning methods.

Submitted 24 September, 2024; originally announced September 2024.

arXiv:2408.04898 [pdf, other]

Object as a Service: Simplifying Cloud-Native Development through Serverless Object Abstraction

Authors: Pawissanutt Lertpongrujikorn, Mohsen Amini Salehi

Abstract: The function-as-a-service (FaaS) paradigm is envisioned as the next generation of cloud computing systems that mitigate the burden for cloud-native application developers by abstracting them from cloud resource management. However, it does not deal with the application data aspects. As such, developers have to intervene and undergo the burden of managing the application data, often via separate cloud storage services. To further streamline cloud-native application development, in this work, we propose a new paradigm, known as Object as a Service (OaaS) that encapsulates application data and functions into the cloud object abstraction. OaaS relieves developers from resource and data management burden while offering built-in optimization features. Inspired by OOP, OaaS incorporates access modifiers and inheritance into the serverless paradigm that: (a) prevents developers from compromising the system via accidentally accessing underlying data; and (b) enables software reuse in cloud-native application development. Furthermore, OaaS natively supports dataflow semantics. It enables developers to define function workflows while transparently handling data navigation, synchronization, and parallelism issues. To establish the OaaS paradigm, we develop a platform named Oparaca that offers state abstraction for structured and unstructured data with consistency and fault-tolerant guarantees. We evaluated Oparaca under real-world settings against state-of-the-art platforms with respect to the imposed overhead, scalability, and ease of use. The results demonstrate that the object abstraction provided by OaaS can streamline flexible and scalable cloud-native application development with an insignificant overhead on the underlying serverless system.

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2407.17391 [pdf, other]

Tutorial: Object as a Service (OaaS) Serverless Cloud Computing Paradigm

Authors: Pawissanutt Lertpongrujikorn, Mohsen Amini Salehi

Abstract: While the first generation of cloud computing systems mitigated the job of system administrators, the next generation of cloud computing systems is emerging to mitigate the burden for cloud developers -- facilitating the development of cloud-native applications. This paradigm shift is primarily happening by offering higher-level serverless abstractions, such as Function as a Service (FaaS). Although FaaS has successfully abstracted developers from the cloud resource management details, it falls short in abstracting the management of both data (i.e., state) and the non-functional aspects, such as Quality of Service (QoS) requirements. The lack of such abstractions implies developer intervention and is counterproductive to the objective of mitigating the burden of cloud-native application development. To further streamline cloud-native application development, we present Object-as-a-Service (OaaS) -- a serverless paradigm that borrows the object-oriented programming concepts to encapsulate application logic and data in addition to non-functional requirements into a single deployment package, thereby streamlining provider-agnostic cloud-native application development. We realized the OaaS paradigm through the development of an open-source platform called Oparaca. In this tutorial, we will present the concept and design of the OaaS paradigm and its implementation -- the Oparaca platform. Then, we give a tutorial on developing and deploying the application on the Oparaca platform and discuss its benefits and its optimal configurations to avoid potential overheads.

Submitted 24 July, 2024; originally announced July 2024.

Journal ref: Proceedings of the 44th International Conference on Distributed Computing Systems Workshops (ICDCSW), Jersey City, New Jersey, July 2024

arXiv:2407.00313 [pdf, other]

FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0

Authors: Sorawit Manatura, Thanawat Chanikaphon, Chantana Chantrapornchai, Mohsen Amini Salehi

Abstract: Service liquidity across edge-to-cloud or multi-cloud will serve as the cornerstone of the next generation of cloud computing systems (Cloud 2.0). Provided that cloud-based services are predominantly containerized, an efficient and robust live container migration solution is required to accomplish service liquidity. In a nod to this growing requirement, in this research, we leverage FastFreeze, a popular platform for process checkpoint/restore within a container, and promote it to be a robust solution for end-to-end live migration of containerized services. In particular, we develop a new platform, called FastMig, that proactively controls the checkpoint/restore operations of FastFreeze, thereby allowing for robust live migration of containerized services via standard HTTP interfaces. The proposed platform introduces post-checkpointing and pre-restoration operations to enhance migration robustness. Notably, the pre-restoration operation includes containerized service startup options, enabling warm restoration and reducing the migration downtime. In addition, we develop a method to make FastFreeze robust against failures that commonly happen during the migration and even during the normal operation of a containerized service. Experimental results under real-world settings show that the migration downtime of a containerized service can be reduced by 30X compared to the situation where the original FastFreeze was deployed for the migration. Moreover, we demonstrate that FastMig and the warm restoration method together can significantly mitigate the container startup overhead. Importantly, these improvements are achieved without any significant performance reduction and incur only a small resource usage overhead, compared to the bare (i.e., non-FastFreeze) containerized services.

Submitted 29 June, 2024; originally announced July 2024.

Comments: Published in IEEE Cloud '24 conference

arXiv:2401.07194 [pdf, other]

Resource Allocation of Industry 4.0 Micro-Service Applications across Serverless Fog Federation

Authors: Razin Farhan Hussain, Mohsen Amini Salehi

Abstract: The Industry 4.0 revolution has been made possible via AI-based applications (e.g., for automation and maintenance) deployed on the serverless edge (aka fog) computing platforms at the industrial sites -- where the data is generated. Nevertheless, fulfilling the fault-intolerant and real-time constraints of Industry 4.0 applications on resource-limited fog systems in remote industrial sites (e.g., offshore oil fields) that are uncertain, disaster-prone, and have no cloud access is challenging. It is this challenge that our research aims at addressing. We consider the inelastic nature of the fog systems, software architecture of the industrial applications (micro-service-based versus monolithic), and scarcity of human experts in remote sites. To enable cloud-like elasticity, our approach is to dynamically and seamlessly (i.e., without human intervention) federate nearby fog systems. Then, we develop serverless resource allocation solutions that are cognizant of the applications' software architecture, their latency requirements, and distributed nature of the underlying infrastructure. We propose methods to seamlessly and optimally partition micro-service-based applications across the federated fog. Our experimental evaluation shows that not only is the elasticity limitation overcome in a serverless manner, but our developed application partitioning method can also serve around 20% more tasks on time than the existing methods in the literature.

Submitted 13 January, 2024; originally announced January 2024.

Comments: Accepted in the Future Generation Computer Systems (FGCS) Journal

arXiv:2312.03235 [pdf, other]

HEET: A Heterogeneity Measure to Quantify the Difference across Distributed Computing Systems

Authors: Ali Mokhtari, Saeid Ghafouri, Pooyan Jamshidi, Mohsen Amini Salehi

Abstract: Although system heterogeneity has been extensively studied in the past, there is yet to be a study on measuring the impact of heterogeneity on system performance. For this purpose, we propose a heterogeneity measure that can characterize the impact of the heterogeneity of a system on its performance behavior in terms of throughput or makespan. We develop a mathematical model to characterize a heterogeneous system in terms of its task and machine heterogeneity dimensions and then reduce it to a single value, called Homogeneous Equivalent Execution Time (HEET), which represents the execution time behavior of the entire system. We used AWS EC2 instances to implement a real-world machine learning inference system. Performance evaluation of the HEET score across different heterogeneous system configurations demonstrates that HEET can accurately characterize the performance behavior of these systems. In particular, the results show that our proposed method is capable of predicting the true makespan of heterogeneous systems without online evaluations with an average precision of 84%. This heterogeneity measure is instrumental for solution architects to configure their systems proactively to be sufficiently heterogeneous to meet their desired performance objectives.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2309.03168 [pdf, other]

UMS: Live Migration of Containerized Services across Autonomous Computing Systems

Authors: Thanawat Chanikaphon, Mohsen Amini Salehi

Abstract: Containerized services deployed within various computing systems, such as edge and cloud, desire live migration support to enable user mobility, elasticity, and load balancing. To enable such a ubiquitous and efficient service migration, a live migration solution needs to handle circumstances where users have various authority levels (full control, limited control, or no control) over the underlying computing systems. Supporting the live migration at these levels serves as the cornerstone of interoperability, and can unlock several use cases across various forms of distributed systems. As such, in this study, we develop a ubiquitous migration solution (called UMS) that, for a given containerized service, can automatically identify the feasible migration approach, and then seamlessly perform the migration across autonomous computing systems. UMS does not interfere with the way the orchestrator handles containers and can coordinate the migration without orchestrator involvement. Moreover, UMS is orchestrator-agnostic, i.e., it can be plugged into any underlying orchestrator platform. UMS is equipped with novel methods that can coordinate and perform the live migration at the orchestrator, container, and service levels. Experimental results show that for single-process containers, the service-level approach, and for multi-process containers with a small (< 128 MiB) memory footprint, the container-level migration approach lead to the lowest migration overhead and service downtime. To demonstrate the potential of UMS in realizing interoperability and multi-cloud scenarios, we used it to perform live service migration across heterogeneous orchestrators, and between Microsoft Azure and Google Cloud.

Submitted 6 September, 2023; originally announced September 2023.

Comments: Accepted in IEEE Globecom 2023 conference

arXiv:2307.16447 [pdf, other]

Confidential Computing across Edge-to-Cloud for Machine Learning: A Survey Study

Authors: SM Zobaed, Mohsen Amini Salehi

Abstract: Confidential computing has gained prominence due to the escalating volume of data-driven applications (e.g., machine learning and big data) and the acute desire for secure processing of sensitive data, particularly across distributed environments, such as the edge-to-cloud continuum. Provided that the works accomplished in this emerging area are scattered across various research fields, this paper aims at surveying the fundamental concepts, and cutting-edge software and hardware solutions developed for confidential computing using trusted execution environments, homomorphic encryption, and secure enclaves. We underscore the significance of building trust in both hardware and software levels and delve into their applications particularly for machine learning (ML) applications. While substantial progress has been made, there are some barely-explored areas that need extra attention from the researchers and practitioners in the community to improve confidentiality aspects, develop more robust attestation mechanisms, and to address vulnerabilities of the existing trusted execution environments. Providing a comprehensive taxonomy of the confidential computing landscape, this survey enables researchers to advance this field to ultimately ensure the secure processing of users' sensitive data across a multitude of applications and computing tiers.

Submitted 31 July, 2023; originally announced July 2023.

arXiv:2303.10901 [pdf, other]

E2C: A Visual Simulator to Reinforce Education of Heterogeneous Computing Systems

Authors: Ali Mokhtari, Drake Rawls, Tony Huynh, Jeremiah Green, Mohsen Amini Salehi

Abstract: With the increasing popularity of accelerator technologies (e.g., GPUs and TPUs) and the emergence of domain-specific computing via ASICs and FPGAs, the matter of heterogeneity and understanding its ramifications on the performance has become more critical than ever before. However, it is challenging to effectively educate students about the potential impacts of heterogeneity on the performance of distributed systems, and on the logic of resource allocation methods to efficiently utilize the resources. Making use of the real infrastructure for benchmarking the performance of heterogeneous machines, for different applications, with respect to different objectives, and under various workload intensities is cost- and time-prohibitive. To reinforce the quality of learning about various dimensions of heterogeneity, and to decrease the widening gap in education, we develop an open-source simulation tool, called E2C, that can help students and researchers to study any type of heterogeneous (or homogeneous) computing system and measure its performance under various configurations. E2C is equipped with an intuitive graphical user interface (GUI) that enables its users to easily examine system-level solutions (scheduling, load balancing, scalability, etc.) in a controlled environment within a short time. E2C is a discrete event simulator that offers the following features: (i) simulating a heterogeneous computing system; (ii) implementing a newly developed scheduling method and plugging it into the system; (iii) measuring energy consumption and other output-related metrics; and (iv) powerful visual aspects to ease the learning curve for students. We used E2C as an assignment in the Distributed and Cloud Computing course. Our anonymous survey study indicates that students rated E2C with a score of 8.7 out of 10 for its usefulness in understanding the concepts of scheduling in heterogeneous computing.

Submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted in Edupar '23, as part of IPDPS '23 Conference. arXiv admin note: text overlap with arXiv:2212.11333

arXiv:2301.00484 [pdf, other]

Federated Fog Computing for Remote Industry 4.0 Applications

Authors: Razin Farhan Hussain, Mohsen Amini Salehi

Abstract: Industry 4.0 operates based on IoT devices, sensors, and actuators, transforming the use of computing resources and software solutions in diverse sectors. Various Industry 4.0 latency-sensitive applications function based on machine learning to process sensor data for automation and other industrial activities. Sending sensor data to cloud systems is time consuming and detrimental to the latency constraints of the applications; thus, fog computing is often deployed. Executing these applications across heterogeneous fog systems demonstrates stochastic execution time behavior that affects the task completion time. We investigate and model various Industry 4.0 ML-based applications' stochastic executions and analyze them. Industries like oil and gas are prone to disasters requiring coordination of various latency-sensitive activities. Hence, fog computing resources can get oversubscribed due to the surge in the computing demands during a disaster. We propose federating nearby fog computing systems and forming a fog federation to make remote Industry 4.0 sites resilient against the surge in computing demands. We propose a statistical resource allocation method across fog federation for latency-sensitive tasks. Many of the modern Industry 4.0 applications operate based on a workflow of micro-services that are used alone within an industrial site. As such, Industry 4.0 solutions need to be aware of applications' architecture, particularly monolithic vs. micro-service. Therefore, we propose a probability-based resource allocation method that can partition micro-service workflows across fog federation to meet their latency constraints. Another concern in Industry 4.0 is the data privacy of the federated fog. As such, we propose a solution based on federated learning to train industrial ML applications across federated fog systems without compromising the data confidentiality.

Submitted 1 January, 2023; originally announced January 2023.

Comments: PhD Dissertation

arXiv:2212.14198 [pdf, other]

Load Balancer Tuning: Comparative Analysis of HAProxy Load Balancing Methods

Authors: Connor Rawls, Mohsen Amini Salehi

Abstract: Load balancing is prevalent in practical application (e.g., web) deployments seen today. One such load balancer, HAProxy, remains relevant as an open-source, easy-to-use system. In the context of web systems, the load balancer tier possesses significant influence over system performance and the incurred cost, which is decisive for cloud-based deployments. Therefore, it is imperative to properly tune the load balancer configuration and get the most performance out of the existing resources. In this technical report, we first introduce the HAProxy architecture and its load balancing methods. Then, we discuss fine-tuning parameters within this load balancer and examine their performance in the face of various workload intensities. Our evaluation encompasses various types of web requests and homogeneous and heterogeneous back-ends. Lastly, based on the findings of this study, we present a set of best practices to optimally configure HAProxy.

Submitted 29 December, 2022; originally announced December 2022.

Comments: Technical Report

arXiv:2212.11333 [pdf, other]

E2C: A Visual Simulator for Heterogeneous Computing Systems

Authors: Ali Mokhtari, Mohsen Amini Salehi

Abstract: Heterogeneity has been an indispensable aspect of distributed computing throughout the history of these systems. In particular, with the increasing prevalence of accelerator technologies (e.g., GPUs and TPUs) and the emergence of domain-specific computing via ASICs and FPGAs, the matter of heterogeneity and harnessing it has become a more critical challenge than ever before. Harnessing system heterogeneity has been a longstanding challenge in distributed systems and has been investigated extensively in the past. Making use of real infrastructure (such as those offered by the public cloud providers) for benchmarking the performance of heterogeneous machines, for different applications, with respect to different objectives, and under various workload intensities is cost- and time-prohibitive. To mitigate this burden, we develop an open-source simulation tool, called E2C, that can help researchers and practitioners study any type of heterogeneous computing system and measure its performance under various system configurations. E2C has an intuitive graphical user interface (GUI) that enables its users to easily examine system-level solutions (scheduling, load balancing, scalability, etc.) in a controlled environment within a short time and at no cost. In particular, E2C offers the following features: (i) simulating a heterogeneous computing system; (ii) implementing a newly developed scheduling method and plugging it into the system; (iii) measuring energy consumption and other output-related metrics; and (iv) powerful visual aspects to ease the learning curve for students. Potential users of E2C can be undergraduate and graduate students in computer science/engineering, researchers, and practitioners.

Submitted 21 December, 2022; originally announced December 2022.

Comments: https://hpcclab.github.io/E2C-Sim-docs/

Journal ref: Tutorial at 15th ACM/IEEE Utility Cloud Computing (UCC '22) conference, Vancouver, Washington, USA, Dec. 2022

arXiv:2211.07130 [pdf, other]

Edge-MultiAI: Multi-Tenancy of Latency-Sensitive Deep Learning Applications on Edge

Authors: SM Zobaed, Ali Mokhtari, Jaya Prakash Champati, Mathieu Kourouma, Mohsen Amini Salehi

Abstract: Smart IoT-based systems often desire continuous execution of multiple latency-sensitive Deep Learning (DL) applications. The edge servers serve as the cornerstone of such IoT-based systems; however, their resource limitations hamper the continuous execution of multiple (multi-tenant) DL applications. The challenge is that DL applications function based on bulky "neural network (NN) models" that cannot be simultaneously maintained in the limited memory space of the edge. Accordingly, the main contribution of this research is to overcome the memory contention challenge, thereby meeting the latency constraints of the DL applications without compromising their inference accuracy. We propose an efficient NN model management framework, called Edge-MultiAI, that ushers the NN models of the DL applications into the edge memory such that the degree of multi-tenancy and the number of warm-starts are maximized. Edge-MultiAI leverages NN model compression techniques, such as model quantization, and dynamically loads NN models for DL applications to stimulate multi-tenancy on the edge server. We also devise a model management heuristic for Edge-MultiAI, called iWS-BFE, that functions based on the Bayesian theory to predict the inference requests for multi-tenant applications, and uses it to choose the appropriate NN models for loading, hence increasing the number of warm-start inferences. We evaluate the efficacy and robustness of Edge-MultiAI under various configurations. The results reveal that Edge-MultiAI can stimulate the degree of multi-tenancy on the edge by at least 2X and increase the number of warm-starts by around 60% without any major loss in the inference accuracy of the applications.

Submitted 14 November, 2022; originally announced November 2022.

Comments: Accepted in Utility Cloud Computing Conference 2022

arXiv:2206.05361 [pdf, other]

Object as a Service (OaaS): Enabling Object Abstraction in Serverless Clouds

Authors: Pawissanutt Lertpongrujikorn, Mohsen Amini Salehi

Abstract: The Function as a Service (FaaS) paradigm is becoming widespread and is envisioned as the next generation of cloud systems that mitigate the burden for programmers and cloud solution architects. However, the FaaS abstraction only makes the cloud resource management aspects transparent but does not deal with the application data aspects. As such, developers have to undergo the burden of managing the application data, often via separate cloud services (e.g., AWS S3). Similarly, the FaaS abstraction does not natively support function workflow; hence, the developers often have to work with workflow orchestration services (e.g., AWS Step Functions) to build workflows. Moreover, they have to explicitly navigate the data throughout the workflow. To overcome these problems of FaaS, we design a higher-level cloud programming abstraction that hides the complexities and mitigates the burden of cloud-native application development. We borrow the notion of object from object-oriented programming and propose a new abstraction level atop the function abstraction, known as Object as a Service (OaaS). OaaS encapsulates the application data and function into the object abstraction and relieves the developers from resource and data management burdens. It also unlocks opportunities for built-in optimization features, such as software reusability, data locality, and caching. OaaS natively supports dataflow programming such that developers define a workflow of functions transparently without getting involved in data navigation, synchronization, and parallelism aspects. We implemented a prototype of the OaaS platform and evaluated it under real-world settings against state-of-the-art platforms regarding the imposed overhead, scalability, and ease of use. The results demonstrate that OaaS streamlines cloud programming and offers scalability with an insignificant overhead to the underlying cloud system.

Submitted 5 September, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

Comments: This version of the paper has been significantly altered and the new observations have been obtained. Therefore, we withdraw the paper until the new version becomes available

Journal ref: IEEE Cloud 2023

arXiv:2206.00065 [pdf, other]

FELARE: Fair Scheduling of Machine Learning Tasks on Heterogeneous Edge Systems

Authors: Ali Mokhtari, Md Abir Hossen, Pooyan Jamshidi, Mohsen Amini Salehi

Abstract: Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications. These edge-based machine learning systems are often battery-powered (i.e., energy-limited). They use heterogeneous resources with diverse computing performance (e.g., CPU, GPU, and/or FPGAs) to fulfill the latency constraints of ML applications. The challenge is to allocate user requests for different ML applications on the Heterogeneous Edge Computing Systems (HEC) with respect to both the energy and latency constraints of these systems. To this end, we study and analyze resource allocation solutions that can increase the on-time task completion rate while considering the energy constraint. Importantly, we investigate edge-friendly (lightweight) multi-objective mapping heuristics that do not become biased toward a particular application type to achieve the objectives; instead, the heuristics consider "fairness" across the concurrent ML applications in their mapping decisions. Performance evaluations demonstrate that the proposed heuristic outperforms widely-used heuristics in heterogeneous systems in terms of the latency and energy objectives, particularly, at low to moderate request arrival rates. We observed 8.9% improvement in on-time task completion rate and 12.6% in energy-saving without imposing any significant overhead on the edge system.

Submitted 20 July, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

arXiv:2201.01940 [pdf, other]

SMSE: A Serverless Platform for Multimedia Cloud Systems

Authors: Chavit Denninnart, Mohsen Amini Salehi

Abstract: Along with the rise of domain-specific computing (ASICs hardware) and domain-specific programming languages, we envision that the next step is the emergence of domain-specific cloud platforms. Developing such platforms for popular applications in the serverless manner not only can offer a higher efficiency to both users and providers, but can also expedite the application development cycles and enable users to become solution-oriented and focus on their specific business logic. Considering multimedia streaming as one of the most trendy applications in the IT industry, the goal of this study is to develop SMSE, the first domain-specific serverless platform for multimedia streaming. SMSE democratizes multimedia service development via enabling content providers (or even end-users) to rapidly develop their desired functionalities on their multimedia contents. Upon developing SMSE, the next goal of this study is to deal with its efficiency challenges and develop a function container provisioning method that can efficiently utilize cloud resources and improve the users' QoS. In particular, we develop a dynamic method that provisions durable or ephemeral containers depending on the spatiotemporal and data-dependency characteristics of the functions. Evaluating the prototype implementation of SMSE under real-world settings demonstrates its capability to reduce both the containerization overhead and the makespan time of serving multimedia processing functions (by up to 30%) compared to the function provisioning methods that are being used in the general-purpose serverless cloud systems.

Submitted 29 September, 2023; v1 submitted 6 January, 2022; originally announced January 2022.

Comments: Accepted in the Journal of Concurrency and Computation: Practice and Experience (CCPE)

arXiv:2112.09780 [pdf, other]

Exploring the Impact of Virtualization on the Usability of the Deep Learning Applications

Authors: Davood G. Samani, Mohsen Amini Salehi

Abstract: Deep Learning-based (DL) applications are becoming increasingly popular and advancing at an unprecedented pace. While many research works are being undertaken to enhance Deep Neural Networks (DNN) -- the centerpiece of DL applications -- practical deployment challenges of these applications in the Cloud and Edge systems, and their impact on the usability of the applications have not been sufficiently investigated. In particular, the impact of deploying different virtualization platforms, offered by the Cloud and Edge, on the usability of DL applications (in terms of the End-to-End (E2E) inference time) has remained an open question. Importantly, resource elasticity (by means of scale-up), CPU pinning, and processor type (CPU vs GPU) configurations have been shown to be influential on the virtualization overhead. Accordingly, the goal of this research is to study the impact of these potentially decisive deployment options on the E2E performance, and thus the usability, of the DL applications. To that end, we measure the impact of four popular execution platforms (namely, bare-metal, virtual machine (VM), container, and container in VM) on the E2E inference time of four types of DL applications, upon changing processor configuration (scale-up, CPU pinning) and processor types. This study reveals a set of interesting and sometimes counter-intuitive findings that can be used as best practices by Cloud solution architects to efficiently deploy DL applications in various systems. The notable finding is that the solution architects must be aware of the DL application characteristics, particularly their pre- and post-processing requirements, to be able to optimally choose and configure an execution platform, determine the use of GPU, and decide the efficient scale-up range.

Submitted 17 December, 2021; originally announced December 2021.

arXiv:2110.06508 [pdf, other]

doi 10.1002/spe.3233

Efficiency in the Serverless Cloud Paradigm: A Survey on the Reusing and Approximation Aspects

Authors: Chavit Denninnart, Thanawat Chanikaphon, Mohsen Amini Salehi

Abstract: Serverless computing along with Function-as-a-Service (FaaS) is forming a new computing paradigm that is anticipated to found the next generation of cloud systems. The popularity of this paradigm is due to offering a highly transparent infrastructure that enables user applications to scale in the granularity of their functions. Since these often small and single-purpose functions are managed on shared computing resources behind the scenes, a great potential for computational reuse and approximate computing emerges that, if unleashed, can remarkably improve the efficiency of serverless cloud systems -- both from the user's QoS and system's (energy consumption and incurred cost) perspectives. Accordingly, the goal of this survey study is to, first, unfold the internal mechanics of serverless computing and, second, explore the scope for efficiency within this paradigm via studying function reuse and approximation approaches and discussing the pros and cons of each one. Next, we outline potential future research directions within this paradigm that can either unlock new use cases or make the paradigm more efficient.

Submitted 25 June, 2023; v1 submitted 13 October, 2021; originally announced October 2021.

Journal ref: Journal of Software-Practice and Experience (SPE), June 2023

arXiv:2104.04474 [pdf, other]

Harnessing the Potential of Function-Reuse in Multimedia Cloud Systems

Authors: Chavit Denninnart, Mohsen Amini Salehi

Abstract: Cloud-based computing systems can get oversubscribed due to the budget constraints of their users or limitations in certain resource types. The oversubscription can, in turn, degrade the users' perceived Quality of Service (QoS). The approach we investigate to mitigate both the oversubscription and the incurred cost is based on smart reusing of the computation needed to process the service requests (i.e., tasks). We propose a reusing paradigm for the tasks that are waiting for execution. This paradigm can be particularly impactful in serverless platforms where multiple users can request similar services simultaneously. Our motivation is a multimedia streaming engine that processes the media segments in an on-demand manner. We propose a mechanism to identify various types of "mergeable" tasks and aggregate them to improve the QoS and mitigate the incurred cost. We develop novel approaches to determine when and how to perform task aggregation such that the QoS of other tasks is not affected. Evaluation results show that the proposed mechanism can improve the QoS by significantly reducing the percentage of tasks missing their deadlines. In addition, it can reduce the overall time (and subsequently the incurred cost) of utilizing cloud services by more than 9%.

Submitted 9 April, 2021; originally announced April 2021.

arXiv:2102.13367 [pdf, other]

SAED: Edge-Based Intelligence for Privacy-Preserving Enterprise Search on the Cloud

Authors: Sakib M Zobaed, Mohsen Amini Salehi, Rajkumar Buyya

Abstract: Cloud-based enterprise search services (e.g., AWS Kendra) have been entrancing big data owners by offering convenient and real-time search solutions to them. However, the problem is that individuals and organizations possessing confidential big data are hesitant to embrace such services due to valid data privacy concerns. In addition, to offer an intelligent search, these services access the user search history, which further jeopardizes his/her privacy. To overcome the privacy problem, the main idea of this research is to separate the intelligence aspect of the search from its pattern matching aspect. According to this idea, the search intelligence is provided by an on-premises edge tier and the shared cloud tier only serves as an exhaustive pattern matching search utility. We propose the Smartness At Edge (SAED) mechanism that offers intelligence in the form of semantic and personalized search at the edge tier while maintaining privacy of the search on the cloud tier. At the edge tier, SAED uses a knowledge-based lexical database to expand the query and cover its semantics. SAED personalizes the search via an RNN model that can learn the user interest. A word embedding model is used to retrieve documents based on their semantic relevance to the search query. SAED is generic and can be plugged into existing enterprise search systems and enable them to offer intelligent and privacy-preserving search without enforcing any change on them. Evaluation results on two enterprise search systems under real settings and verified by human users demonstrate that SAED can improve the relevancy of the retrieved results by on average 24% for plain-text and 75% for encrypted generic datasets.

Submitted 11 March, 2021; v1 submitted 26 February, 2021; originally announced February 2021.

arXiv:2102.05260 [pdf, other]

SensPick: Sense Picking for Word Sense Disambiguation

Authors: Sm Zobaed, Md Enamul Haque, Md Fazle Rabby, Mohsen Amini Salehi

Abstract: Word sense disambiguation (WSD) methods identify the most suitable meaning of a word with respect to the usage of that word in a specific context. Neural network-based WSD approaches rely on a sense-annotated corpus since they do not utilize lexical resources. In this study, we utilize both context and related gloss information of a target word to model the semantic relationship between the word and the set of glosses. We propose SensPick, a type of stacked bidirectional Long Short Term Memory (LSTM) network to perform the WSD task. The experimental evaluation demonstrates that SensPick outperforms traditional and state-of-the-art models on most of the benchmark datasets with a relative improvement of 3.5% in F-1 score. While the improvement is not significant, incorporating semantic relationships puts SensPick in the leading position compared to others.

Submitted 9 February, 2021; originally announced February 2021.

Journal ref: 16th IEEE International Conference on Semantic Computing, ICSC'2021

arXiv:2012.06054 [pdf, other]

Analyzing the Performance of Smart Industry 4.0 Applications on Cloud Computing Systems

Authors: Razin Farhan Hussain, Alireza Pakravan, Mohsen Amini Salehi

Abstract: Cloud-based Deep Neural Network (DNN) applications that make latency-sensitive inference are becoming an indispensable part of Industry 4.0. Due to the multi-tenancy and resource heterogeneity, both inherent to the cloud computing environments, the inference time of DNN-based applications is stochastic. Such stochasticity, if not captured, can potentially lead to low Quality of Service (QoS) or even a disaster in critical sectors, such as the oil and gas industry. To make Industry 4.0 robust, solution architects and researchers need to understand the behavior of DNN-based applications and capture the stochasticity that exists in their inference times. Accordingly, in this study, we provide a descriptive analysis of the inference time from two perspectives. First, we perform an application-centric analysis and statistically model the execution time of four categorically different DNN applications on both Amazon and Chameleon clouds. Second, we take a resource-centric approach and analyze a rate-based metric in the form of Million Instructions Per Second (MIPS) for heterogeneous machines in the cloud. This non-parametric modeling, achieved via Jackknife and Bootstrap re-sampling methods, provides the confidence interval of MIPS for heterogeneous cloud machines. The findings of this research can be helpful for researchers and cloud solution architects to develop solutions that are robust against the stochastic nature of the inference time of DNN applications in the cloud and can offer a higher QoS to their users and avoid unintended outcomes.

Submitted 10 December, 2020; originally announced December 2020.

Journal ref: IEEE International Conference on High Performance Computing and Communications (HPCC 2020)
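
The abstract above mentions Bootstrap re-sampling to obtain a confidence interval of MIPS. As a rough illustration only, the following Python sketch computes a percentile-bootstrap interval for the mean of a sample. The function name `bootstrap_ci`, the resample count, and the MIPS values are assumptions made for this example and do not come from the paper, whose exact procedure is not described in the listing.

```python
import random
import statistics


def bootstrap_ci(samples, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement and record their mean.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical MIPS measurements for one machine type (illustrative only).
mips_samples = [1510.0, 1492.5, 1533.1, 1478.9, 1520.4, 1501.7, 1489.3, 1515.8]
low, high = bootstrap_ci(mips_samples)
print(f"95% CI for mean MIPS: [{low:.1f}, {high:.1f}]")
```

The percentile method is used here because it makes no distributional assumption, which matches the non-parametric framing in the abstract; other bootstrap variants would be equally plausible.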

arXiv:2012.06021 [pdf, other]

Descriptive and Predictive Analysis of Aggregating Functions in Serverless Clouds: the Case of Video Streaming

Authors: Shangrui Wu, Chavit Denninnart, Xiangbo Li, Yang Wang, Mohsen Amini Salehi

Abstract: Serverless clouds allocate multiple tasks (e.g., micro-services) from multiple users on a shared pool of computing resources. This enables serverless cloud providers to reduce their resource usage by transparently aggregating similar tasks of a certain context (e.g., video processing) that share the whole or part of their computation. To this end, it is crucial to know the amount of time-saving achieved by aggregating the tasks. Lack of such knowledge can lead to uninformed merging and scheduling decisions that, in turn, can cause deadline violation of either the merged tasks or other following tasks. Accordingly, in this paper, we study the problem of estimating the execution-time saving resulting from merging tasks, with the example in the context of video processing. To learn the execution-time saving in different forms of merging, we first establish a set of benchmarking videos and examine a wide variety of video processing tasks -- with and without merging in place. We observed that although merging can save up to 44% in the execution-time, the number of possible merging cases is intractable. Hence, in the second part, we leverage the benchmarking results and develop a method based on Gradient Boosting Decision Tree (GBDT) to estimate the time-saving for any given task merging case. Experimental results show that the method can estimate the time-saving with the error rate of 0.04, measured based on Root Mean Square Error (RMSE).

Submitted 10 December, 2020; originally announced December 2020.

Journal ref: IEEE HPCC 2020
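
The abstract above reports its estimation error as a Root Mean Square Error (RMSE). For readers unfamiliar with the metric, here is a minimal Python sketch of how RMSE is computed. The function name `rmse` and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(predicted, actual):
    """Root Mean Square Error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted vs. measured time-saving fractions (illustrative only).
predicted_saving = [0.30, 0.12, 0.41, 0.25]
measured_saving = [0.33, 0.10, 0.44, 0.21]
print(f"RMSE = {rmse(predicted_saving, measured_saving):.3f}")
```

Because the errors are squared before averaging, RMSE penalizes a few large misestimates more heavily than many small ones.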

arXiv:2011.14976 [pdf, other]

Cloud-Based Video Streaming Services: A Survey

Authors: Xiangbo Li, Mahmoud Darwich, Magdy Bayoumi, Mohsen Amini Salehi

Abstract: Video streaming, in various forms of video on demand (VOD), live, and 360 degree streaming, has grown dramatically during the past few years. In comparison to traditional cable broadcasters whose contents can only be watched on TVs, video streaming is ubiquitous and viewers can flexibly watch the video contents on various devices, ranging from smart-phones to laptops and large TV screens. Such ubiquity and flexibility are enabled by interweaving multiple technologies, such as video compression, cloud computing, content delivery networks, and several other technologies. As video streaming gains more popularity and dominates the Internet traffic, it is essential to understand the way it operates and the interplay of different technologies involved in it. Accordingly, the first goal of this paper is to unveil sophisticated processes to deliver a raw captured video to viewers' devices. In particular, we elaborate on the video encoding, transcoding, packaging, encryption, and delivery processes. We survey recent efforts in academia and industry to enhance these processes. As the video streaming industry is increasingly becoming reliant on cloud computing, the second goal of this survey is to explore and survey the ways cloud services are utilized to enable video streaming services. The third goal of the study is to position the undertaken research works in cloud-based video streaming and identify challenges that need to be obviated in the future to advance the cloud-based video streaming industry to a more flexible and user-centric service.

Submitted 30 November, 2020; originally announced November 2020.

Comments: accepted to be published in Elsevier book chapter Advances in Computers Volume 123

arXiv:2006.02055 [pdf, other]

The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms

Authors: Davood Ghatreh Samani, Chavit Denninnart, Josef Bacik, Mohsen Amini Salehi

Abstract: Cloud providers offer a variety of execution platforms in the form of bare-metal, VM, and containers. However, due to the pros and cons of each execution platform, choosing the appropriate platform for a specific cloud-based application has become a challenge for solution architects. The possibility to combine these platforms (e.g., deploying containers within VMs) offers new capacities that make the challenge even more complicated. However, there is little study in the literature on the pros and cons of deploying different application types on various execution platforms. In particular, the evaluation of diverse hardware configurations and different CPU provisioning methods, such as CPU pinning, has not been sufficiently studied in the literature. In this work, the performance overhead of container, VM, and bare-metal execution platforms is measured and analyzed for four categories of real-world applications, namely video processing, parallel processing (MPI), web processing, and No-SQL, respectively representing CPU intensive, parallel processing, and two IO intensive processes. Our analyses reveal a set of interesting and sometimes counterintuitive findings that can be used as best practices by the solution architects to efficiently deploy cloud-based applications. Here are some notable mentions: (A) Under specific circumstances, containers can impose a higher overhead than VMs; (B) Containers on top of VMs can mitigate the overhead of VMs for certain applications; (C) Containers with a large number of cores impose a lower overhead than those with a few cores.

Submitted 3 June, 2020; originally announced June 2020.

Journal ref: The 49th International Conference on Parallel Processing (ICPP 2020)

arXiv:2005.11317 [pdf, other]

Privacy-Preserving Clustering of Unstructured Big Data for Cloud-Based Enterprise Search Solutions

Authors: SM Zobaed, Mohsen Amini Salehi

Abstract: Cloud-based enterprise search services (e.g., Amazon Kendra) are enchanting to big data owners by providing them with convenient search solutions over their enterprise big datasets. However, individuals and businesses that deal with confidential big data (e.g., credential documents) are reluctant to fully embrace such services, due to valid concerns about data privacy. Solutions based on client-side encryption have been explored to mitigate privacy concerns. Nonetheless, such solutions hinder data processing, specifically clustering, which is pivotal in dealing with different forms of big data. For instance, clustering is critical to limit the search space and perform real-time search operations on big datasets. To overcome the hindrance in clustering encrypted big data, we propose privacy-preserving clustering schemes for three forms of unstructured encrypted big datasets, namely static, semi-dynamic, and dynamic datasets. To preserve data privacy, the proposed clustering schemes function based on statistical characteristics of the data and determine (A) the suitable number of clusters and (B) appropriate content for each cluster. Experimental results obtained from evaluating the clustering schemes on three different datasets demonstrate between 30% and 60% improvement in the clusters' coherency compared to other clustering schemes for encrypted data. Employing the clustering schemes in a privacy-preserving enterprise search system decreases its search time by up to 78%, while increasing the search accuracy by up to 35%.

Submitted 8 June, 2022; v1 submitted 22 May, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:1908.04960

arXiv:2005.11050 [pdf, other]

Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing Systems

Authors: Ali Mokhtari, Chavit Denninnart, Mohsen Amini Salehi

Abstract: Robustness of a distributed computing system is defined as the ability to maintain its performance in the presence of uncertain parameters. Uncertainty is a key problem in heterogeneous (and even homogeneous) distributed computing systems that perturbs system robustness. Notably, the performance of these systems is perturbed by uncertainty in both task execution time and arrival. Accordingly, our goal is to make the system robust against these uncertainties. Considering task execution time as a random variable, we use probabilistic analysis to develop an autonomous proactive task dropping mechanism to attain our robustness goal. Specifically, we provide a mathematical model that identifies the optimality of a task dropping decision, so that the system robustness is maximized. Then, we leverage the mathematical model to develop a task dropping heuristic that achieves the system robustness within a feasible time complexity. Although the proposed model is generic and can be applied to any distributed system, we concentrate on heterogeneous computing (HC) systems that have a higher degree of exposure to uncertainty than homogeneous systems. Experimental results demonstrate that the autonomous proactive dropping mechanism can improve the system robustness by up to 20%.

Submitted 22 May, 2020; originally announced May 2020.

Journal ref: in 29th Heterogeneity in Computing Workshop (HCW 2019), in the Proceedings of the IPDPS 2019 Workshops & PhD Forum (IPDPSW)

arXiv:1908.04960 [pdf, other]

ClustCrypt: Privacy-Preserving Clustering of Unstructured Big Data in the Cloud

Authors: SM Zobaed, Sahan Ahmad, Raju Gottumukkala, Mohsen Amini Salehi

Abstract: Security and confidentiality of big data stored in the cloud are important concerns for many organizations to adopt cloud services. One common approach to address the concerns is client-side encryption where data is encrypted on the client machine before being stored in the cloud. Having encrypted data in the cloud, however, limits the ability of data clustering, which is a crucial part of many data analytics applications, such as search systems. To overcome the limitation, in this paper, we present an approach named ClustCrypt for efficient topic-based clustering of encrypted unstructured big data in the cloud. ClustCrypt dynamically estimates the optimal number of clusters based on the statistical characteristics of encrypted data. It also provides a clustering approach for encrypted data. We deploy ClustCrypt within the context of a secure cloud-based semantic search system (S3BD). Experimental results obtained from evaluating ClustCrypt on three datasets demonstrate on average 60% improvement in clusters' coherency. ClustCrypt also decreases the search-time overhead by up to 78% and increases the accuracy of search results by up to 35%.

Submitted 14 August, 2019; originally announced August 2019.

Comments: High Performance Computing and Communications (HPCC '19)

arXiv:1908.03668 [pdf, other]

Edge Computing for User-Centric Secure Search on Cloud-Based Encrypted Big Data

Authors: Sahan Ahmad, SM Zobaed, Raju Gottumukkala, Mohsen Amini Salehi

Abstract: Cloud service providers offer a low-cost and convenient solution to host unstructured data. However, cloud services act as third-party solutions and do not provide control of the data to users. This has raised security and privacy concerns for many organizations (users) with sensitive data to utilize cloud-based solutions. User-side encryption can potentially address these concerns by establishing user-centric cloud services and granting data control to the user. Nonetheless, user-side encryption limits the ability to process (e.g., search) encrypted data on the cloud. Accordingly, in this research, we provide a framework that enables processing (in particular, searching) of encrypted multi-organizational (i.e., multi-source) big data without revealing the data to the cloud provider. Our framework leverages the locality feature of edge computing to offer a user-centric search ability in a real-time manner. In particular, the edge system intelligently predicts the user's search pattern and prunes the multi-source big data search space to reduce the search time. The pruning system is based on efficient sampling from the clustered big dataset on the cloud. For each cluster, the pruning system dynamically samples an appropriate number of terms based on the user's search tendency, so that the cluster is optimally represented. We developed a prototype of a user-centric search system and evaluated it against multiple datasets. Experimental results demonstrate 27% improvement in the pruning quality and search accuracy.

Submitted 9 August, 2019; originally announced August 2019.

Comments: High Performance Computing and Communications (HPCC '19)

arXiv:1905.04460 [pdf, other]

Serverless Edge Computing for Green Oil and Gas Industry

Authors: Razin Farhan Hussain, Mohsen Amini Salehi, Omid Semiari

Abstract: Development of autonomous and self-driving vehicles requires agile and reliable services to manage hazardous road situations. Vehicular Network is the medium that can provide high-quality services for self-driving vehicles. The majority of service requests in Vehicular Networks are delay intolerant (e.g., hazard alerts, lane change warning) and require immediate service. Therefore, Vehicular Networks, and particularly, Vehicle-to-Infrastructure (V2I) systems must provide a consistent real-time response to autonomous vehicles. During peak hours or disasters, when a surge of requests arrives at a Base Station, it is challenging for the V2I system to maintain its performance, which can lead to hazardous consequences. Hence, the goal of this research is to develop a V2I system that is robust against uncertain request arrivals. To achieve this goal, we propose to dynamically allocate service requests among Base Stations. We develop an uncertainty-aware resource allocation method for the federated environment that assigns arriving requests to a Base Station so that the likelihood of completing it on-time is maximized. We evaluate the system under various workload conditions and oversubscription levels. Simulation results show that edge federation can improve robustness of the V2I system by reducing the overall service miss rate by up to 45%.

Submitted 11 May, 2019; originally announced May 2019.

arXiv:1905.04459 [pdf, other]

F-FDN: Federation of Fog Computing Systems for Low Latency Video Streaming

Authors: Vaughan Veillon, Chavit Denninnart, Mohsen Amini Salehi

Abstract: Video streaming is growing in popularity and has become the most bandwidth-consuming Internet service. As such, robust streaming in terms of low latency and uninterrupted streaming experience, particularly for viewers in distant areas, has become a challenge. The common practice to reduce latency is to pre-process multiple versions of each video and use Content Delivery Networks (CDN) to cache videos that are popular in a geographical area. However, with the fast-growing video repository sizes, caching video contents in multiple versions on each CDN is becoming inefficient. Accordingly, in this paper, we propose the architecture for Fog Delivery Networks (FDN) and provide methods to federate them (called F-FDN) to reduce video streaming latency. In addition to caching, FDNs have the ability to process videos in an on-demand manner. F-FDN leverages cached contents on the neighboring FDNs to further reduce latency. In particular, F-FDN is equipped with methods that aim at reducing latency through probabilistically evaluating the cost benefit of fetching video segments either from neighboring FDNs or by processing them. Experimental results against alternative streaming methods show that both on-demand processing and leveraging cached video segments on neighboring FDNs can remarkably reduce streaming latency (on average 52%).

Submitted 11 May, 2019; originally announced May 2019.

Comments: 3rd IEEE International Conference on Fog and Edge Computing (ICFEC 2019)

arXiv:1905.04458 [pdf, other]

Robust Resource Allocation Using Edge Computing for Vehicle to Infrastructure (V2I) Networks

Authors: Anna Kovalenko, Razin Farhan Hussain, Omid Semiari, Mohsen Amini Salehi

Abstract: Development of autonomous and self-driving vehicles requires agile and reliable services to manage hazardous road situations. Vehicular Network is the medium that can provide high-quality services for self-driving vehicles. The majority of service requests in Vehicular Networks are delay intolerant (e.g., hazard alerts, lane change warning) and require immediate service. Therefore, Vehicular Networks, and particularly, Vehicle-to-Infrastructure (V2I) systems must provide a consistent real-time response to autonomous vehicles. During peak hours or disasters, when a surge of requests arrives at a Base Station, it is challenging for the V2I system to maintain its performance, which can lead to hazardous consequences. Hence, the goal of this research is to develop a V2I system that is robust against uncertain request arrivals. To achieve this goal, we propose to dynamically allocate service requests among Base Stations. We develop an uncertainty-aware resource allocation method for the federated environment that assigns arriving requests to a Base Station so that the likelihood of completing it on-time is maximized. We evaluate the system under various workload conditions and oversubscription levels. Simulation results show that edge federation can improve robustness of the V2I system by reducing the overall service miss rate by up to 45%.

Submitted 11 May, 2019; originally announced May 2019.

Journal ref: 3rd IEEE International Conference on Fog and Edge Computing (ICFEC 2019)

arXiv:1905.04456 [pdf, other]

Improving Robustness of Heterogeneous Serverless Computing Systems Via Probabilistic Task Pruning

Authors: Chavit Denninnart, James Gentry, Mohsen Amini Salehi

Abstract: Cloud-based serverless computing is an increasingly popular computing paradigm. In this paradigm, different services have diverse computing requirements that justify deploying an inconsistently Heterogeneous Computing (HC) system to efficiently process them. In an inconsistently HC system, each task needed for a given service potentially exhibits different execution times on each type of machine. An ideal resource allocation system must be aware of such uncertainties in execution times and be robust against them, so that Quality of Service (QoS) requirements of users are met. This research aims to maximize the robustness of an HC system utilized to offer a serverless computing system, particularly when the system is oversubscribed. Our strategy to maximize robustness is to develop a task pruning mechanism that can be added to existing task-mapping heuristics without altering them. Pruning tasks with a low probability of meeting their deadlines improves the likelihood of other tasks meeting their deadlines, thereby increasing system robustness and overall QoS. To evaluate the impact of the pruning mechanism, we examine it on various configurations of heterogeneous and homogeneous computing systems. Evaluation results indicate a considerable improvement (up to 35%) in the system robustness.

Submitted 11 May, 2019; originally announced May 2019.

Comments: IPDPSW '19

arXiv:1901.09312 [pdf, other]

Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems

Authors: James Gentry, Chavit Denninnart, Mohsen Amini Salehi

Abstract: In heterogeneous distributed computing (HC) systems, diversity can exist in both computational resources and arriving tasks. In an inconsistently heterogeneous computing system, task types have different execution times on heterogeneous machines. A method is required to map arriving tasks to machines based on machine availability and performance, maximizing the number of tasks meeting deadlines (defined as robustness). For tasks with hard deadlines (e.g., those in live video streaming), tasks that miss their deadlines are dropped. The problem investigated in this research is maximizing the robustness of an oversubscribed HC system. A way to maximize this robustness is to prune (i.e., defer or drop) tasks with low probability of meeting their deadlines to increase the probability of other tasks meeting their deadlines. In this paper, we first provide a mathematical model to estimate a task's probability of meeting its deadline in the presence of task dropping. We then investigate methods for engaging probabilistic dropping and we find thresholds for dropping and deferring. Next, we develop a pruning-aware mapping heuristic and extend it to engender fairness across various task types. We show the cost benefit of using probabilistic pruning in an HC system. Simulation results, harnessing a selection of mapping heuristics, show efficacy of the pruning mechanism in improving robustness (on average by 25%) and cost in an oversubscribed HC system by up to 40%.

Submitted 26 January, 2019; originally announced January 2019.

Journal ref: 33rd IEEE International Parallel & Distributed Processing Symposium, 2019, (IPDPS '19)

arXiv:1811.09767 [pdf, other]

Survey on Secure Search Over Encrypted Data on the Cloud

Authors: Hoang Pham, Jason Woodworth, Mohsen Amini Salehi

Abstract: Cloud computing has become a potential resource for businesses and individuals to outsource their data to remote but highly accessible servers. However, potentials of the cloud services have not been fully unleashed due to users' concerns about security and privacy of their data in the cloud. User-side encryption techniques can be employed to mitigate the security concerns. Nonetheless, once the data is encrypted, no processing (e.g., searching) can be performed on the outsourced data. Searchable Encryption (SE) techniques have been widely studied to enable searching on the data while they are encrypted. These techniques enable various types of search on the encrypted data and offer different levels of security. In addition, although these techniques enable different search types and vary in details, they share similarities in their components and architectures. In this paper, we provide a comprehensive survey on different secure search techniques, a high-level architecture for these systems, and an analysis of their performance and security level.

Submitted 24 November, 2018; originally announced November 2018.

arXiv:1809.07927 [pdf, other]

S3BD: Secure Semantic Search over Encrypted Big Data in the Cloud

Authors: Jason Woodworth, Mohsen Amini Salehi

Abstract: Cloud storage is a widely utilized service for both personal and enterprise demands. However, despite its advantages, many potential users with enormous amounts of sensitive data (big data) refrain from fully utilizing the cloud storage service due to valid concerns about data privacy. An established solution to the cloud data privacy problem is to perform encryption on the client-end. This approach, however, restricts data processing capabilities (e.g., searching over the data). Accordingly, the research problem we investigate is how to enable real-time searching over the encrypted big data in the cloud. In particular, semantic search is of interest to clients dealing with big data. To address this problem, in this research, we develop a system (termed S3BD) for searching big data using cloud services without exposing any data to cloud providers. To keep real-time response on big data, S3BD proactively prunes the search space to a subset of the whole dataset. For that purpose, we propose a method to cluster the encrypted data. An abstract of each cluster is maintained on the client-end to navigate the search operation to appropriate clusters at the search time. Results of experiments, carried out on real-world big datasets, demonstrate that the search operation can be achieved in real-time and is significantly more efficient than other counterparts. In addition, a fully functional prototype of S3BD is made publicly available.

Submitted 20 September, 2018; originally announced September 2018.

Journal ref: CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE (CCPE), Sep. 2018

arXiv:1809.06536 [pdf, other]

Leveraging Computational Reuse for Cost- and QoS-Efficient Task Scheduling in Clouds

Authors: Chavit Denninnart, Mohsen Amini Salehi, Adel Nadjaran Toosi, Xiangbo Li

Abstract: Cloud-based computing systems could get oversubscribed due to budget constraints of cloud users, which causes violation of Quality of Experience (QoE) metrics such as tasks' deadlines. We investigate an approach to achieve robustness against uncertain task arrival and oversubscription through smart reuse of computation while similar tasks are waiting for execution. Our motivation in this study is a cloud-based video streaming engine that processes video streaming tasks in an on-demand manner. We propose a mechanism to identify various types of "mergeable" tasks and determine when it is appropriate to aggregate tasks without affecting QoS of other tasks. Experiments show that our mechanism can improve robustness of the system and also save the overall time of using cloud services by more than 14%.

Submitted 18 September, 2018; originally announced September 2018.

Comments: 8 pages

Journal ref: In the 16th International Conference on Service-Oriented Computing, 2018

arXiv:1809.06529 [pdf, other]

Performance Analysis and Modeling of Video Transcoding Using Heterogeneous Cloud Services

Authors: Xiangbo Li, Mohsen Amini Salehi, Yamini Joshi, Mahmoud Darwich, Brad Landreneau, Magdy Bayoumi

Abstract: High-quality video streaming, either in the form of Video-On-Demand (VOD) or live streaming, usually requires converting (i.e., transcoding) video streams to match the characteristics of viewers' devices (e.g., in terms of spatial resolution or supported formats). Considering the computational cost of the transcoding operation and the surge in video streaming demands, Streaming Service Providers (SSPs) are becoming reliant on cloud services to guarantee Quality of Service (QoS) of streaming for their viewers. Cloud providers offer heterogeneous computational services in the form of different types of Virtual Machines (VMs) with diverse prices. Effective utilization of cloud services for video transcoding requires detailed performance analysis of different video transcoding operations on the heterogeneous cloud VMs. In this research, for the first time, we provide a thorough analysis of the performance of the video stream transcoding on heterogeneous cloud VMs. Providing such analysis is crucial for efficient prediction of transcoding time on heterogeneous VMs and for the functionality of any scheduling methods tailored for video transcoding. Based upon the findings of this analysis and by considering the cost difference of heterogeneous cloud VMs, in this research, we also provide a model to quantify the degree of suitability of each cloud VM type for various transcoding tasks. The provided model can supply resource (VM) provisioning methods with accurate performance and cost trade-offs to efficiently utilize cloud services for video streaming.

Submitted 18 September, 2018; originally announced September 2018.

Comments: 15 pages

Journal ref: IEEE Transactions on Parallel and Distributed Systems (TPDS), Sep. 2018

arXiv:1808.06015 [pdf, other]

Ultra Reliable, Low Latency Vehicle-to-Infrastructure Wireless Communications with Edge Computing

Authors: Md Mostofa Kamal Tareq, Omid Semiari, Mohsen Amini Salehi, Walid Saad

Abstract: Ultra reliable, low latency vehicle-to-infrastructure (V2I) communications is a key requirement for seamless operation of autonomous vehicles (AVs) in future smart cities. To this end, cellular small base stations (SBSs) with edge computing capabilities can reduce the end-to-end (E2E) service delay by processing requested tasks from AVs locally, without forwarding the tasks to a remote cloud server. Nonetheless, due to the limited computational capabilities of the SBSs, coupled with the scarcity of the wireless bandwidth resources, minimizing the E2E latency for AVs and achieving a reliable V2I network is challenging. In this paper, a novel algorithm is proposed to jointly optimize AVs-to-SBSs association and bandwidth allocation to maximize the reliability of the V2I network. By using tools from labor matching markets, the proposed framework can effectively perform distributed association of AVs to SBSs, while accounting for the latency needs of AVs as well as the limited computational and bandwidth resources of SBSs. Moreover, the convergence of the proposed algorithm to a core allocation between AVs and SBSs is proved and its ability to capture interdependent computational and transmission latencies for AVs in a V2I network is characterized. Simulation results show that by optimizing the E2E latency, the proposed algorithm substantially outperforms conventional cell association schemes, in terms of service reliability and latency.

Submitted 17 August, 2018; originally announced August 2018.

Comments: Proc. of IEEE Global Communications Conference (GLOBECOM), Mobile and Wireless Networks Symposium

arXiv:1711.01008 [pdf, other]

doi 10.1109/TPDS.2017.2766069

Cost-Efficient and Robust On-Demand Video Transcoding Using Heterogeneous Cloud Services

Authors: Xiangbo Li, Mohsen Amini Salehi, Magdy Bayoumi, Nian-Feng Tzeng, Rajkumar Buyya

Abstract: Video streams usually have to be transcoded to match the characteristics of viewers' devices. Streaming providers have to store numerous transcoded versions of a given video to serve various display devices. Given the fact that viewers' access pattern to video streams follows a long tail distribution, for the video streams with low access rate, we propose to transcode them in an on-demand manner using cloud computing services. The challenge in utilizing cloud services for on-demand video transcoding is to maintain a robust QoS for viewers and cost-efficiency for streaming service providers. To address this challenge, we present the Cloud-based Video Streaming Services (CVS2) architecture. It includes a QoS-aware scheduling that maps transcoding tasks to the VMs by considering the affinity of the transcoding tasks with the allocated heterogeneous VMs. To maintain robustness in the presence of varying streaming requests, the architecture includes a cost-efficient VM Provisioner. This component provides a self-configurable cluster of heterogeneous VMs. The cluster is reconfigured dynamically to maintain the maximum affinity with the arriving workload. Results obtained under diverse workload conditions demonstrate that the CVS2 architecture can maintain a robust QoS for viewers while reducing the incurred cost of the streaming service provider by up to 85%.

Submitted 2 November, 2017; originally announced November 2017.

Comments: IEEE Transactions on Parallel and Distributed Systems
