-
Input-Based Ensemble-Learning Method for Dynamic Memory Configuration of Serverless Computing Functions
Authors:
Siddharth Agarwal,
Maria A. Rodriguez,
Rajkumar Buyya
Abstract:
In today's Function-as-a-Service offerings, a programmer is usually responsible for configuring function memory for its successful execution, which allocates proportional function resources such as CPU and network. However, right-sizing the function memory force developers to speculate performance and make ad-hoc configuration decisions. Recent research has highlighted that a function's input char…
▽ More
In today's Function-as-a-Service offerings, a programmer is usually responsible for configuring function memory for its successful execution, which allocates proportional function resources such as CPU and network. However, right-sizing the function memory force developers to speculate performance and make ad-hoc configuration decisions. Recent research has highlighted that a function's input characteristics, such as input size, type and number of inputs, significantly impact its resource demand, run-time performance and costs with fluctuating workloads. This correlation further makes memory configuration a non-trivial task. On that account, an input-aware function memory allocator not only improves developer productivity by completely hiding resource-related decisions but also drives an opportunity to reduce resource wastage and offer a finer-grained cost-optimised pricing scheme. Therefore, we present MemFigLess, a serverless solution that estimates the memory requirement of a serverless function with input-awareness. The framework executes function profiling in an offline stage and trains a multi-output Random Forest Regression model on the collected metrics to invoke input-aware optimal configurations. We evaluate our work with the state-of-the-art approaches on AWS Lambda service to find that MemFigLess is able to capture the input-aware resource relationships and allocate upto 82% less resources and save up to 87% run-time costs.
△ Less
Submitted 11 November, 2024;
originally announced November 2024.
-
On-demand Cold Start Frequency Reduction with Off-Policy Reinforcement Learning in Serverless Computing
Authors:
Siddharth Agarwal,
Maria A. Rodriguez,
Rajkumar Buyya
Abstract:
Function-as-a-Service (FaaS) is a cloud computing paradigm offering an event-driven execution model to applications. It features serverless attributes by eliminating resource management responsibilities from developers, and offers transparent and on-demand scalability of applications. To provide seamless on-demand scalability, new function instances are prepared to serve the incoming workload in t…
▽ More
Function-as-a-Service (FaaS) is a cloud computing paradigm offering an event-driven execution model to applications. It features serverless attributes by eliminating resource management responsibilities from developers, and offers transparent and on-demand scalability of applications. To provide seamless on-demand scalability, new function instances are prepared to serve the incoming workload in the absence or unavailability of function instances. However, FaaS platforms are known to suffer from cold starts, where this function provisioning process introduces a non-negligible delay in function response and reduces the end-user experience. Therefore, the presented work focuses on reducing the frequent, on-demand cold starts on the platform by using Reinforcement Learning(RL). The proposed approach uses model-free Q-learning that consider function metrics such as CPU utilization, existing function instances, and response failure rate, to proactively initialize functions, in advance, based on the expected demand. The proposed solution is implemented on Kubeless and evaluated using an open-source function invocation trace applied to a matrix multiplication function. The evaluation results demonstrate a favourable performance of the RL-based agent when compared to Kubeless' default policy and a function keep-alive policy by improving throughput by up to 8.81% and reducing computation load and resource wastage by up to 55% and 37%, respectively, that is a direct outcome of reduced cold starts.
△ Less
Submitted 12 November, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
A Deep Recurrent-Reinforcement Learning Method for Intelligent AutoScaling of Serverless Functions
Authors:
Siddharth Agarwal,
Maria A. Rodriguez,
Rajkumar Buyya
Abstract:
FaaS introduces a lightweight, function-based cloud execution model that finds its relevance in a range of applications like IoT-edge data processing and anomaly detection. While cloud service providers offer a near-infinite function elasticity, these applications often experience fluctuating workloads and stricter performance constraints. A typical CSP strategy is to empirically determine and adj…
▽ More
FaaS introduces a lightweight, function-based cloud execution model that finds its relevance in a range of applications like IoT-edge data processing and anomaly detection. While cloud service providers offer a near-infinite function elasticity, these applications often experience fluctuating workloads and stricter performance constraints. A typical CSP strategy is to empirically determine and adjust desired function instances or resources, known as autoscaling, based on monitoring-based thresholds such as CPU or memory, to cope with demand and performance. However, threshold configuration either requires expert knowledge, historical data or a complete view of the environment, making autoscaling a performance bottleneck that lacks an adaptable solution. RL algorithms are proven to be beneficial in analysing complex cloud environments and result in an adaptable policy that maximizes the expected objectives. Most realistic cloud environments usually involve operational interference and have limited visibility, making them partially observable. A general solution to tackle observability in highly dynamic settings is to integrate Recurrent units with model-free RL algorithms and model a decision process as a POMDP. Therefore, in this paper, we investigate model-free Recurrent RL agents for function autoscaling and compare them against the model-free PPO algorithm. We explore the integration of a LSTM network with the state-of-the-art PPO algorithm to find that under our experimental and evaluation settings, recurrent policies were able to capture the environment parameters and show promising results for function autoscaling. We further compare a PPO-based autoscaling agent with commercially used threshold-based function autoscaling and posit that a LSTM-based autoscaling agent is able to improve throughput by 18%, function execution by 13% and account for 8.4% more function instances.
△ Less
Submitted 11 November, 2024; v1 submitted 11 August, 2023;
originally announced August 2023.
-
High-Availability Clusters: A Taxonomy, Survey, and Future Directions
Authors:
Premathas Somasekaram,
Radu Calinescu,
Rajkumar Buyya
Abstract:
The delivery of key services in domains ranging from finance and manufacturing to healthcare and transportation is underpinned by a rapidly growing number of mission-critical enterprise applications. Ensuring the continuity of these complex applications requires the use of software-managed infrastructures called high-availability clusters (HACs). HACs employ sophisticated techniques to monitor the…
▽ More
The delivery of key services in domains ranging from finance and manufacturing to healthcare and transportation is underpinned by a rapidly growing number of mission-critical enterprise applications. Ensuring the continuity of these complex applications requires the use of software-managed infrastructures called high-availability clusters (HACs). HACs employ sophisticated techniques to monitor the health of key enterprise application layers and of the resources they use, and to seamlessly restart or relocate application components after failures. In this paper, we first describe the manifold uses of HACs to protect essential layers of a critical application and present the architecture of high availability clusters. We then propose a taxonomy that covers all key aspects of HACs -- deployment patterns, application areas, types of cluster, topology, cluster management, failure detection and recovery, consistency and integrity, and data synchronisation; and we use this taxonomy to provide a comprehensive survey of the end-to-end software solutions available for the HAC deployment of enterprise applications. Finally, we discuss the limitations and challenges of existing HAC solutions, and we identify opportunities for future research in the area.
△ Less
Submitted 21 September, 2022; v1 submitted 30 September, 2021;
originally announced September 2021.
-
Master Graduation Thesis: A Lightweight and Distributed Container-based Framework
Authors:
Qifan Deng,
Rajkumar Buyya
Abstract:
Edge/Fog computing is a novel computing paradigm that provides resource-limited Internet of Things (IoT) devices with scalable computing and storage resources. Compared to cloud computing, edge/fog servers have fewer resources, but they can be accessed with higher bandwidth and less communication latency. Thus, integrating edge/fog and cloud infrastructures can support the execution of diverse lat…
▽ More
Edge/Fog computing is a novel computing paradigm that provides resource-limited Internet of Things (IoT) devices with scalable computing and storage resources. Compared to cloud computing, edge/fog servers have fewer resources, but they can be accessed with higher bandwidth and less communication latency. Thus, integrating edge/fog and cloud infrastructures can support the execution of diverse latency-sensitive and computation-intensive IoT applications. Although some frameworks attempt to provide such integration, there are still several challenges to be addressed, such as dynamic scheduling of different IoT applications, scalability mechanisms, multi-platform support, and supporting different interaction models. To overcome these challenges, we propose a lightweight and distributed container-based framework, called FogBus2. It provides a mechanism for scheduling heterogeneous IoT applications and implements several scheduling policies. Also, it proposes an optimized genetic algorithm to obtain fast convergence to well-suited solutions. Besides, it offers a scalability mechanism to ensure efficient responsiveness when either the number of IoT devices increases or the resources become overburdened. Also, the dynamic resource discovery mechanism of FogBus2 assists new entities to quickly join the system. We have also developed two IoT applications, called Conway's Game of Life and Video Optical Character Recognition to demonstrate the effectiveness of FogBus2 for handling real-time and non-real-time IoT applications. Experimental results show FogBus2's scheduling policy improves the response time of IoT applications by 53\% compared to other policies. Also, the scalability mechanism can reduce up to 48\% of the queuing waiting time compared to frameworks that do not support scalability.
△ Less
Submitted 7 August, 2021;
originally announced August 2021.
-
Resource Management in Edge and Fog Computing using FogBus2 Framework
Authors:
Mohammad Goudarzi,
Qifan Deng,
Rajkumar Buyya
Abstract:
Edge/Fog computing is a novel computing paradigm that provides resource-limited Internet of Things (IoT) devices with scalable computing and storage resources. Compared to cloud computing, edge/fog servers have fewer resources, but they can be accessed with higher bandwidth and less communication latency. Thus, integrating edge/fog and cloud infrastructures can support the execution of diverse lat…
▽ More
Edge/Fog computing is a novel computing paradigm that provides resource-limited Internet of Things (IoT) devices with scalable computing and storage resources. Compared to cloud computing, edge/fog servers have fewer resources, but they can be accessed with higher bandwidth and less communication latency. Thus, integrating edge/fog and cloud infrastructures can support the execution of diverse latency-sensitive and computation-intensive IoT applications. Although some frameworks attempt to provide such integration, there are still several challenges to be addressed, such as dynamic scheduling of different IoT applications, scalability mechanisms, multi-platform support, and supporting different interaction models. FogBus2, as a new python-based framework, offers a lightweight and distributed container-based framework to overcome these challenges. In this chapter, we highlight key features of the FogBus2 framework alongside describing its main components. Besides, we provide a step-by-step guideline to set up an integrated computing environment, containing multiple cloud service providers (Hybrid-cloud) and edge devices, which is a prerequisite for any IoT application scenario. To obtain this, a low-overhead communication network among all computing resources is initiated by the provided scripts and configuration files. Next, we provide instructions and corresponding code snippets to install and run the main framework and its integrated applications. Finally, we demonstrate how to implement and integrate several new IoT applications and custom scheduling and scalability policies with the FogBus2 framework.
△ Less
Submitted 1 August, 2021;
originally announced August 2021.
-
A Drone-based Networked System and Methods for Combating Coronavirus Disease (COVID-19) Pandemic
Authors:
Adarsh Kumar,
Kriti Sharma,
Harvinder Singh,
Sagar Gupta Naugriya,
Sukhpal Singh Gill,
Rajkumar Buyya
Abstract:
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. It is similar to influenza viruses and raises concerns through alarming levels of spread and severity resulting in an ongoing pandemic worldwide. Within eight months (by August 2020), it infected 24.0 million persons worldwide and over 824 thousand have died. Drones or Unmanned Aerial Vehicles (UAVs)…
▽ More
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. It is similar to influenza viruses and raises concerns through alarming levels of spread and severity resulting in an ongoing pandemic worldwide. Within eight months (by August 2020), it infected 24.0 million persons worldwide and over 824 thousand have died. Drones or Unmanned Aerial Vehicles (UAVs) are very helpful in handling the COVID-19 pandemic. This work investigates the drone-based systems, COVID-19 pandemic situations, and proposes an architecture for handling pandemic situations in different scenarios using real-time and simulation-based scenarios. The proposed architecture uses wearable sensors to record the observations in Body Area Networks (BANs) in a push-pull data fetching mechanism. The proposed architecture is found to be useful in remote and highly congested pandemic areas where either the wireless or Internet connectivity is a major issue or chances of COVID-19 spreading are high. It collects and stores the substantial amount of data in a stipulated period and helps to take appropriate action as and when required. In real-time drone-based healthcare system implementation for COVID-19 operations, it is observed that a large area can be covered for sanitization, thermal image collection, and patient identification within a short period (2 KMs within 10 minutes approx.) through aerial route. In the simulation, the same statistics are observed with an addition of collision-resistant strategies working successfully for indoor and outdoor healthcare operations. Further, open challenges are identified and promising research directions are highlighted.
△ Less
Submitted 31 August, 2020; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Application Management in Fog Computing Environments: A Taxonomy, Review and Future Directions
Authors:
Redowan Mahmud,
Kotagiri Ramamohanarao,
Rajkumar Buyya
Abstract:
The Internet of Things (IoT) paradigm is being rapidly adopted for the creation of smart environments in various domains. The IoT-enabled Cyber-Physical Systems (CPSs) associated with smart city, healthcare, Industry 4.0 and Agtech handle a huge volume of data and require data processing services from different types of applications in real-time. The Cloud-centric execution of IoT applications bar…
▽ More
The Internet of Things (IoT) paradigm is being rapidly adopted for the creation of smart environments in various domains. The IoT-enabled Cyber-Physical Systems (CPSs) associated with smart city, healthcare, Industry 4.0 and Agtech handle a huge volume of data and require data processing services from different types of applications in real-time. The Cloud-centric execution of IoT applications barely meets such requirements as the Cloud datacentres reside at a multi-hop distance from the IoT devices. \textit{Fog computing}, an extension of Cloud at the edge network, can execute these applications closer to data sources. Thus, Fog computing can improve application service delivery time and resist network congestion. However, the Fog nodes are highly distributed, heterogeneous and most of them are constrained in resources and spatial sharing. Therefore, efficient management of applications is necessary to fully exploit the capabilities of Fog nodes. In this work, we investigate the existing application management strategies in Fog computing and review them in terms of architecture, placement and maintenance. Additionally, we propose a comprehensive taxonomy and highlight the research gaps in Fog-based application management. We also discuss a perspective model and provide future research directions for further improvement of application management in Fog computing.
△ Less
Submitted 21 May, 2020;
originally announced May 2020.
-
Industrial Internet of Things (IIoT) Applications of Edge and Fog Computing: A Review and Future Directions
Authors:
G. S. S. Chalapathi,
Vinay Chamola,
Aabhaas Vaish,
Rajkumar Buyya
Abstract:
With rapid technological advancements within the domain of Internet of Things (IoT), strong trends have emerged which indicate a rapid growth in the number of smart devices connected to IoT networks and this growth cannot be supported by traditional cloud computing platforms. In response to the increased capacity of data being transferred over networks, the edge and fog computing paradigms have em…
▽ More
With rapid technological advancements within the domain of Internet of Things (IoT), strong trends have emerged which indicate a rapid growth in the number of smart devices connected to IoT networks and this growth cannot be supported by traditional cloud computing platforms. In response to the increased capacity of data being transferred over networks, the edge and fog computing paradigms have emerged as extremely viable frameworks that shift computational and storage resources towards the edge of the network, thereby migrating processing power from centralized cloud servers to distributed LAN resources and powerful embedded devices within the network. These computing paradigms, therefore, have the potential to support massive IoT networks of the future and have also fueled the advancement of IoT systems within industrial settings, leading to the creation of the Industrial Internet of Things (IIoT) technology that is revolutionizing industrial processes in a variety of domains. In this paper, we elaborate on the impact of edge and fog computing paradigms on IIoT. We also highlight the how edge and fog computing are poised to bring about a turnaround in several industrial applications through a use-case approach. Finally, we conclude with the current issues and challenges faced by these paradigms in IIoT and suggest some research directions that should be followed to solve these problems and accelerate the adaptation of edge and fog computing in IIoT.
△ Less
Submitted 2 December, 2019;
originally announced December 2019.
-
iGateLink: A Gateway Library for Linking IoT, Edge, Fog and Cloud Computing Environments
Authors:
Riccardo Mancini,
Shreshth Tuli,
Tommaso Cucinotta,
Rajkumar Buyya
Abstract:
In recent years, the Internet of Things (IoT) has been growing in popularity, along with the increasingly important role played by IoT gateways, mediating the interactions among a plethora of heterogeneous IoT devices and cloud services. In this paper, we present iGateLink, an open-source Android library easing the development of Android applications acting as a gateway between IoT devices and Edg…
▽ More
In recent years, the Internet of Things (IoT) has been growing in popularity, along with the increasingly important role played by IoT gateways, mediating the interactions among a plethora of heterogeneous IoT devices and cloud services. In this paper, we present iGateLink, an open-source Android library easing the development of Android applications acting as a gateway between IoT devices and Edge/Fog/Cloud Computing environments. Thanks to its pluggable design, modules providing connectivity with a number of devices acting as data sources or Fog/Cloud frameworks can be easily reused for different applications. Using iGateLink in two case-studies replicating previous works in the healthcare and image processing domains, the library proved to be effective in adapting to different scenarios and speeding up the development of gateway applications, as compared to the use of conventional methods.
△ Less
Submitted 16 November, 2019;
originally announced November 2019.
-
HealthFog: An Ensemble Deep Learning based Smart Healthcare System for Automatic Diagnosis of Heart Diseases in Integrated IoT and Fog Computing Environments
Authors:
Shreshth Tuli,
Nipam Basumatary,
Sukhpal Singh Gill,
Mohsen Kahani,
Rajesh Chand Arya,
Gurpreet Singh Wander,
Rajkumar Buyya
Abstract:
Cloud computing provides resources over the Internet and allows a plethora of applications to be deployed to provide services for different industries. The major bottleneck being faced currently in these cloud frameworks is their limited scalability and hence inability to cater to the requirements of centralized Internet of Things (IoT) based compute environments. The main reason for this is that…
▽ More
Cloud computing provides resources over the Internet and allows a plethora of applications to be deployed to provide services for different industries. The major bottleneck being faced currently in these cloud frameworks is their limited scalability and hence inability to cater to the requirements of centralized Internet of Things (IoT) based compute environments. The main reason for this is that latency-sensitive applications like health monitoring and surveillance systems now require computation over large amounts of data (Big Data) transferred to centralized database and from database to cloud data centers which leads to drop in performance of such systems. The new paradigms of fog and edge computing provide innovative solutions by bringing resources closer to the user and provide low latency and energy-efficient solutions for data processing compared to cloud domains. Still, the current fog models have many limitations and focus from a limited perspective on either accuracy of results or reduced response time but not both. We proposed a novel framework called HealthFog for integrating ensemble deep learning in Edge computing devices and deployed it for a real-life application of automatic Heart Disease analysis. HealthFog delivers healthcare as a fog service using IoT devices and efficiently manages the data of heart patients, which comes as user requests. Fog-enabled cloud framework, FogBus is used to deploy and test the performance of the proposed model in terms of power consumption, network bandwidth, latency, jitter, accuracy and execution time. HealthFog is configurable to various operation modes that provide the best Quality of Service or prediction accuracy, as required, in diverse fog computation scenarios and for different user requirements.
△ Less
Submitted 15 November, 2019;
originally announced November 2019.