-
Efficient Training Approaches for Performance Anomaly Detection Models in Edge Computing Environments
Authors:
Duneesha Fernando,
Maria A. Rodriguez,
Patricia Arroba,
Leila Ismail,
Rajkumar Buyya
Abstract:
Microservice architectures are increasingly used to modularize IoT applications and deploy them in distributed and heterogeneous edge computing environments. Over time, these microservice-based IoT applications are susceptible to performance anomalies caused by resource hogging (e.g., CPU or memory), resource contention, etc., which can negatively impact their Quality of Service and violate their Service Level Agreements. Existing research on performance anomaly detection for edge computing environments focuses on model training approaches that either achieve high accuracy at the expense of a time-consuming and resource-intensive training process or prioritize training efficiency at the cost of lower accuracy. To address this gap, while considering the resource constraints and the large number of devices in modern edge platforms, we propose two clustering-based model training approaches: (1) intra-cluster parameter transfer learning-based model training (ICPTL) and (2) cluster-level model training (CM). These approaches aim to find a trade-off between the training efficiency of anomaly detection models and their accuracy. We compared the models trained under ICPTL and CM to models trained for specific devices (most accurate, least efficient) and a single general model trained for all devices (least accurate, most efficient). Our findings show that the model accuracy of ICPTL is comparable to that of the model-per-device approach while requiring only 40% of the training time. In addition, CM further improves training efficiency by requiring 23% less training time and reducing the number of trained models by approximately 66% compared to ICPTL, while still achieving higher accuracy than a single general model.
Submitted 23 August, 2024;
originally announced August 2024.
-
Enhancing Regression Models for Complex Systems Using Evolutionary Techniques for Feature Engineering
Authors:
Patricia Arroba,
José L. Risco-Martín,
Marina Zapater,
José M. Moya,
José L. Ayala
Abstract:
This work proposes an automatic methodology for modeling complex systems. Our methodology is based on the combination of Grammatical Evolution and classical regression to obtain an optimal set of features that form part of a linear and convex model. This technique provides both Feature Engineering and Symbolic Regression in order to infer accurate models without requiring effort or expertise from the designer. As advanced Cloud services are becoming mainstream, the contribution of data centers to the overall power consumption of modern cities is growing dramatically. These facilities consume from 10 to 100 times more power per square foot than typical office buildings. Modeling the power consumption of these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers not yet satisfied by analytical approaches. For this case study, our methodology minimizes error in power prediction. This work has been tested using real Cloud applications, resulting in an average error in power estimation of 3.98%. Our work improves the possibilities of deriving energy-efficient policies in Cloud data centers and is applicable to other computing environments with similar characteristics.
Submitted 14 March, 2024;
originally announced July 2024.
-
Green Adaptation of Real-Time Web Services for Industrial CPS within a Cloud Environment
Authors:
Teresa Higuera,
José L. Risco-Martín,
Patricia Arroba,
José L. Ayala
Abstract:
Managing energy efficiency under timing constraints is a major challenge. This work proposes an accurate power model for time-constrained servers in Cloud data centers. Unlike previous approaches, this model considers not only the workload assigned to the processing element but also the static power consumption and, even more interestingly, its dependence on temperature. The proposed model has been used in a multi-objective optimization environment in which the Dynamic Voltage and Frequency Scaling (DVFS) and workload assignment have been efficiently optimized.
Submitted 29 January, 2024;
originally announced January 2024.
-
Heuristics and Metaheuristics for Dynamic Management of Computing and Cooling Energy in Cloud Data Centers
Authors:
Patricia Arroba,
José L. Risco-Martín,
José M. Moya,
José L. Ayala
Abstract:
Data centers handle impressively high figures in terms of energy consumption, and the growing popularity of Cloud applications is intensifying their computational demand. Moreover, the cooling needed to keep the servers within reliable thermal operating conditions also has an impact on the thermal distribution of the data room, thus affecting servers' power leakage. Optimizing the energy consumption of these infrastructures is a major challenge in making data centers more scalable. Thus, understanding the relationship between power, temperature, consolidation and performance is crucial to enable energy-efficient management at the data center level. In this research, we propose novel power and thermal-aware strategies and models to provide joint cooling and computing optimizations from a local perspective based on the global energy consumption of metaheuristic-based optimizations. Our results show that the combined awareness from both metaheuristic and best fit decreasing algorithms allows us to capture the global energy consumption in faster and lighter optimization strategies that may be used during runtime. This approach allows us to improve the energy efficiency of the data center, considering both computing and cooling infrastructures, by up to 21.74% while maintaining quality of service.
Submitted 17 December, 2023;
originally announced December 2023.
-
Mercury: A modeling, simulation, and optimization framework for data stream-oriented IoT applications
Authors:
Román Cárdenas,
Patricia Arroba,
Roberto Blanco,
Pedro Malagón,
José L. Risco-Martín,
José M. Moya
Abstract:
The Internet of Things is transforming our society by monitoring the behavior of users and infrastructures to enable new services that will improve quality of life and resource management. These applications require a vast amount of localized information to be processed in real time, so the deployment of new fog computing infrastructures that bring computing closer to the data sources is a major concern. In this context, we present Mercury, a Modeling, Simulation, and Optimization (M&S&O) framework to analyze the dimensioning and the dynamic operation of real-time fog computing scenarios. Our research proposes a location-aware solution that supports data stream analytics applications including FaaS-based computation offloading. Mercury implements a detailed structural and behavioral simulation model, providing fine-grained simulation outputs, and is described using the Discrete Event System Specification (DEVS) mathematical formalism, helping to validate the model's implementation. Finally, we present a case study using real traces from a driver assistance scenario, offering a detailed comparison with other state-of-the-art simulators.
Submitted 2 November, 2023;
originally announced December 2023.
-
The DEVStone Metric: Performance Analysis of DEVS Simulation Engines
Authors:
Román Cárdenas,
Kevin Henares,
Patricia Arroba,
José L. Risco-Martín,
Gabriel A. Wainer
Abstract:
The DEVStone benchmark allows us to evaluate the performance of discrete-event simulators based on the DEVS formalism. It provides model sets with different characteristics, enabling the analysis of specific issues of simulation engines. However, this heterogeneity hinders the comparison of results among studies, as the results obtained in each study depend on the chosen subset of DEVStone models. We define the DEVStone metric based on the DEVStone synthetic benchmark and provide a mechanism for specifying objective ratings for DEVS-based simulators. This metric corresponds to the average number of times that a simulator can execute a selection of 12 DEVStone models in one minute. The variety of the chosen models ensures we measure different particularities provided by DEVStone. The proposed metric allows us to compare various simulators and to assess the impact of new features on their performance. We use the DEVStone metric to compare some popular DEVS-based simulators.
Submitted 28 September, 2023;
originally announced September 2023.
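As a rough illustration of the metric described in the abstract above, the sketch below reads "the average number of times that a simulator can execute a selection of 12 DEVStone models in one minute" as 60 seconds divided by the total time needed to run the 12 models once. This reading is an assumption: the abstract does not give the exact formula (for instance, whether the models are weighted), and the function name `runs_per_minute` and the timing values are hypothetical.

```python
from typing import Sequence


def runs_per_minute(model_times_s: Sequence[float]) -> float:
    """Estimate how many times a full model selection fits in one minute.

    model_times_s holds the measured wall-clock time, in seconds, that a
    simulator needs to execute each model of the selection once.
    Assumption: unweighted sum; the paper's exact definition may differ.
    """
    if not model_times_s:
        raise ValueError("at least one model time is required")
    if any(t <= 0 for t in model_times_s):
        raise ValueError("model times must be positive")
    # Time to run the whole selection once, then scale to 60 seconds.
    return 60.0 / sum(model_times_s)


if __name__ == "__main__":
    # Hypothetical measurements for a selection of 12 models.
    times = [0.5] * 12
    print(f"score: {runs_per_minute(times):.2f} runs per minute")
```

Under this reading, a higher score means a faster simulator, and halving every model time doubles the score.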
-
Bringing AI to the edge: A formal M&S specification to deploy effective IoT architectures
Authors:
Román Cárdenas,
Patricia Arroba,
José L. Risco-Martín
Abstract:
The Internet of Things is transforming our society, providing new services that improve the quality of life and resource management. These applications are based on ubiquitous networks of multiple distributed devices, with limited computing resources and power, capable of collecting and storing data from heterogeneous sources in real-time. To avoid network saturation and high delays, new architectures such as fog computing are emerging to bring computing infrastructure closer to data sources. Additionally, new data centers are needed to provide real-time Big Data and data analytics capabilities at the edge of the network, where energy efficiency needs to be considered to ensure a sustainable and effective deployment in areas of human activity. In this research, we present an IoT model based on the principles of Model-Based Systems Engineering defined using the Discrete Event System Specification formalism. The provided mathematical formalism covers the description of the entire architecture, from IoT devices to the processing units in edge data centers. Our work includes the location-awareness of user equipment, network, and computing infrastructures to optimize federated resource management in terms of delay and power consumption. We present an effective framework to assist the dimensioning and the dynamic operation of IoT data stream analytics applications, demonstrating our contributions through a driving assistance use case based on real traces and data.
Submitted 11 May, 2023;
originally announced May 2023.
-
Sustainable Edge Computing: Challenges and Future Directions
Authors:
Patricia Arroba,
Rajkumar Buyya,
Román Cárdenas,
José L. Risco-Martín,
José M. Moya
Abstract:
An increasing amount of data is being injected into the network from IoT (Internet of Things) applications. Many of these applications, developed to improve society's quality of life, are latency-critical and inject large amounts of data into the network. These requirements of IoT applications trigger the emergence of the Edge computing paradigm. Currently, data centers are responsible for between 2% and 3% of global energy use. However, this trend is difficult to maintain, as bringing computing infrastructures closer to the edge of the network comes with its own set of challenges for energy efficiency. In this paper, we propose our approach for the sustainability of future computing infrastructures to provide (i) an energy-efficient and economically viable deployment, (ii) a fault-tolerant automated operation, and (iii) a collaborative resource management to improve resource efficiency. We identify the main limitations of applying Cloud-based approaches close to the data sources and present the research challenges to Edge sustainability arising from these constraints. We propose two-phase immersion cooling, formal modeling, machine learning, and energy-centric federated management as Edge-enabling technologies. We present our early results towards the sustainability of an Edge infrastructure to demonstrate the benefits of our approach for future computing environments and deployments.
Submitted 10 April, 2023;
originally announced April 2023.
-
Energy-Efficiency and Sustainability in New Generation Cloud Computing: A Vision and Directions for Integrated Management of Data Centre Resources and Workloads
Authors:
Rajkumar Buyya,
Shashikant Ilager,
Patricia Arroba
Abstract:
Cloud computing has become a critical infrastructure for modern society, like electric power grids and roads. As the backbone of the modern economy, it offers subscription-based computing services anytime, anywhere, on a pay-as-you-go basis. Its use is growing exponentially with the continued development of new classes of applications driven by a huge number of emerging networked devices. However, the success of Cloud computing has created a new global energy challenge, as it comes at the cost of vast energy usage. Currently, data centres hosting Cloud services world-wide consume more energy than most countries. Globally, by 2025, they are projected to consume 20% of global electricity and emit up to 5.5% of the world's carbon emissions. In addition, a significant part of the energy consumed is transformed into heat which leads to operational problems, including a reduction in system reliability and the life expectancy of devices, and escalation in cooling requirements. Therefore, for future generations of Cloud computing to address the environmental and operational consequences of such significant energy usage, they must become energy-efficient and environmentally sustainable while continuing to deliver high-quality services.
In this paper, we propose a vision for a learning-centric approach for the integrated management of new generation Cloud computing environments to reduce their energy consumption and carbon footprint while delivering service quality guarantees. We identify the dimensions and key issues of integrated resource management and our envisioned approaches to address them. We present a conceptual architecture for energy-efficient new generation Clouds and early results on the integrated management of resources and workloads that evidence its potential benefits towards energy efficiency and sustainability.
Submitted 20 July, 2023; v1 submitted 19 March, 2023;
originally announced March 2023.
-
Runtime data center temperature prediction using Grammatical Evolution techniques
Authors:
Marina Zapater,
José L. Risco-Martín,
Patricia Arroba,
José L. Ayala,
José M. Moya,
Román Hermida
Abstract:
Data Centers are huge power consumers, both because of the energy required for computation and the cooling needed to keep servers below thermal redlining. The most common technique to minimize cooling costs is increasing data room temperature. However, to avoid reliability issues, and to enhance energy efficiency, there is a need to predict the temperature attained by servers under variable cooling setups. Due to the complex thermal dynamics of data rooms, accurate runtime data center temperature prediction has remained an important challenge. By using Grammatical Evolution techniques, this paper presents a methodology for the generation of temperature models for data centers and the runtime prediction of CPU and inlet temperature under variable cooling setups. As opposed to time-costly Computational Fluid Dynamics techniques, our models do not need specific knowledge about the problem, can be used in arbitrary data centers, can be re-trained if conditions change, and have negligible overhead during runtime prediction. Our models have been trained and tested by using traces from real Data Center scenarios. Our results show how we can fully predict the temperature of the servers in a data room, with prediction errors below 2 °C and 0.5 °C in CPU and server inlet temperature, respectively.
Submitted 11 November, 2022;
originally announced November 2022.
-
Data augmentation through multivariate scenario forecasting in Data Centers using Generative Adversarial Networks
Authors:
Jaime Pérez,
Patricia Arroba,
José M. Moya
Abstract:
The Cloud paradigm is at a critical point in which the existing energy-efficiency techniques are reaching a plateau, while the demand for computing resources at Data Center facilities continues to increase exponentially. The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that we need massive amounts of data to feed the algorithms. This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center. For this purpose, we implement a powerful generative algorithm: Generative Adversarial Networks (GANs). Specifically, our work combines the disciplines of GAN-based data augmentation and scenario forecasting, filling the gap in the generation of synthetic data in DCs. Furthermore, we propose a methodology to increase the variability and heterogeneity of the generated data by introducing on-demand anomalies without additional effort or expert knowledge. We also suggest the use of Kullback-Leibler Divergence and Mean Squared Error as new metrics in the validation of synthetic time series generation, as they provide a better overall comparison of multivariate data distributions. We validate our approach using real data collected in an operating Data Center, successfully generating synthetic data helpful for prediction and optimization models. Our research will help optimize the energy consumed in Data Centers, although the proposed methodology can be employed in any similar time-series-like problem.
Submitted 29 March, 2022; v1 submitted 12 January, 2022;
originally announced January 2022.
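The abstract above suggests Kullback-Leibler Divergence and Mean Squared Error as validation metrics for synthetic time series. The following is a minimal sketch of how the two quantities could be computed for a single variable, assuming the distributions are compared through fixed-width histograms with a small smoothing constant. The binning scheme, the `eps` smoothing, all function names, and the sample values are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import List, Sequence


def histogram(samples: Sequence[float], bins: int, lo: float, hi: float) -> List[int]:
    """Count samples in `bins` equal-width bins over [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = int((x - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range values to edge bins
        counts[idx] += 1
    return counts


def kl_divergence(p_counts: Sequence[int], q_counts: Sequence[int], eps: float = 1e-9) -> float:
    """KL(P || Q) in nats between two histograms over the same bins.

    eps is an assumed smoothing constant that avoids log(0) on empty bins.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must have the same number of bins")
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


def mse(real: Sequence[float], synthetic: Sequence[float]) -> float:
    """Mean squared error between two equally long series."""
    if len(real) != len(synthetic) or not real:
        raise ValueError("series must be non-empty and of equal length")
    return sum((r - s) ** 2 for r, s in zip(real, synthetic)) / len(real)


if __name__ == "__main__":
    # Hypothetical temperature-like traces, for illustration only.
    real = [20.1, 20.4, 21.0, 22.3, 22.8]
    synth = [20.0, 20.6, 21.2, 22.1, 23.0]
    p = histogram(real, bins=4, lo=19.0, hi=24.0)
    q = histogram(synth, bins=4, lo=19.0, hi=24.0)
    print("KL divergence:", kl_divergence(p, q))
    print("MSE:", mse(real, synth))
```

The two metrics are complementary in this sketch: the KL divergence compares the value distributions regardless of ordering, while the MSE compares the series point by point. For multivariate data, one option is to apply them per variable.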