-
RL-TIME: Reinforcement Learning-based Task Replication in Multicore Embedded Systems
Authors:
Roozbeh Siyadatzadeh,
Mohsen Ansari,
Muhammad Shafique,
Alireza Ejlali
Abstract:
Embedded systems power many modern applications and must often meet strict reliability, real-time, thermal, and power requirements. Task replication can improve reliability by duplicating a task's execution to handle transient and permanent faults, but blindly applying replication often leads to excessive overhead and higher temperatures. Existing design-time methods typically choose the number of replicas based on worst-case conditions, which can waste resources under normal operation. In this paper, we present RL-TIME, a reinforcement learning-based approach that dynamically decides the number of replicas according to actual system conditions. By considering both the reliability target and a core-level Thermal Safe Power (TSP) constraint at run-time, RL-TIME adapts the replication strategy to avoid unnecessary overhead and overheating. Experimental results show that, compared to state-of-the-art methods, RL-TIME reduces power consumption by 63%, increases schedulability by 53%, and respects TSP 72% more often.
Submitted 16 March, 2025;
originally announced March 2025.
-
GAP: Game Theory-Based Approach for Reliability and Power Management in Emerging Fog Computing
Authors:
Abolfazl Younesi,
Mohsen Ansari,
Alireza Ejlali,
Mohammad Amin Fazli,
Muhammad Shafique,
Jörg Henkel
Abstract:
Fog computing brings about a transformative shift in data management, presenting unprecedented opportunities for enhanced performance and reduced latency. However, one of the key aspects of fog computing revolves around ensuring efficient power and reliability management. To address this challenge, we have introduced a novel model that proposes a non-cooperative game theory-based strategy to strike a balance between power consumption and reliability in decision-making processes. Our proposed model capitalizes on the Cold Primary/Backup strategy (CPB) to guarantee the reliability target by re-executing tasks on different nodes when a fault occurs, while also leveraging Dynamic Voltage and Frequency Scaling (DVFS) to reduce power consumption during task execution and maximize overall efficiency. Non-cooperative game theory plays a pivotal role in our model, as it facilitates the development of strategies and solutions that uphold reliability while reducing power consumption. By treating the trade-off between power and reliability as a non-cooperative game, our proposed method yields significant energy savings, with up to a 35% reduction in energy consumption, a 41% decrease in wait time, and a 31% shorter completion time compared to state-of-the-art approaches. Our findings underscore the value of game theory in optimizing power and reliability within fog computing environments, demonstrating its potential for driving substantial improvements.
Submitted 15 December, 2024;
originally announced December 2024.
-
Uncovering EDK2 Firmware Flaws: Insights from Code Audit Tools
Authors:
Mahsa Farahani,
Ghazal Shenavar,
Ali Hosseinghorban,
Alireza Ejlali
Abstract:
Firmware serves as a foundational software layer in modern computers, initiating as the first code executed on platform hardware, similar in function to a minimal operating system. Defined as a software interface between an operating system and platform firmware, the Unified Extensible Firmware Interface (UEFI) standardizes system initialization and management. A prominent open-source implementation of UEFI, the EFI Development Kit II (EDK2), plays a crucial role in shaping firmware architecture. Despite its widespread adoption, the architecture faces challenges such as limited system resources at early stages and a lack of standard security features. Furthermore, the scarcity of open-source tools specifically designed for firmware analysis emphasizes the need for adaptable, innovative solutions.
In this paper, we explore the application of general code audit tools to firmware, with a particular focus on EDK2. Although these tools were not originally designed for firmware analysis, they have proven effective in identifying critical areas for enhancement in firmware security. Our findings, derived from deploying key audit tools on EDK2, categorize these tools based on their methodologies and illustrate their capability to uncover unique firmware attributes, significantly contributing to the understanding and improvement of firmware security.
Submitted 22 September, 2024;
originally announced September 2024.
-
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends
Authors:
Abolfazl Younesi,
Mohsen Ansari,
MohammadAmin Fazli,
Alireza Ejlali,
Muhammad Shafique,
Jörg Henkel
Abstract:
In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and image segmentation. There are numerous types of CNNs designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, depthwise convolutions, and NAS, among others. Each type of CNN has its unique structure and characteristics, making it suitable for specific tasks. It's crucial to gain a thorough understanding and perform a comparative analysis of these different CNN types to understand their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type of CNN can aid in the development of new and improved architectures in the future. We also dive into the platforms and frameworks that researchers utilize for their research or development from various perspectives. Additionally, we explore the main research fields of CNN like 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
Submitted 28 February, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Toward the Design of Fault-Tolerance- and Peak-Power-Aware Multi-Core Mixed-Criticality Systems
Authors:
Behnaz Ranjbar,
Ali Hosseinghorban,
Mohammad Salehi,
Alireza Ejlali,
Akash Kumar
Abstract:
Mixed-Criticality (MC) systems have recently been devised to address the requirements of real-time systems in industrial applications, where the system runs tasks with different criticality levels on a single platform. In some workloads, a high-criticality task might overrun and overload the system, or a fault can occur during the execution. However, these systems must be fault-tolerant and guarantee the correct execution of all high-criticality tasks by their deadlines in any situation, to avoid catastrophic consequences. Furthermore, in these MC systems, the peak power consumption of the system may increase, especially in an overload situation, and exceed the processor Thermal Design Power (TDP) constraint. This may generate heat beyond the cooling capacity, forcing the system to halt the processor to avoid excessive heat. In this paper, we propose a technique for dependent dual-criticality tasks in fault-tolerant multi-core MC systems to manage peak power consumption and temperature. The technique develops a tree of possible task mappings and schedules at design-time to cover all possible scenarios and reduce the low-criticality task drop rate in the high-criticality mode. At run-time, the system exploits the tree to select a proper schedule according to fault occurrences and criticality mode changes. Experimental results show that the task schedulability is 74.14% on average for the proposed method, while the peak power consumption and maximum temperature are improved by 16.65% and 14.9 °C on average, respectively, compared to a recent work. In addition, for a real-life application, our method reduces the peak power and maximum temperature by up to 20.06% and 5 °C, respectively, compared to a state-of-the-art approach.
Submitted 31 May, 2021; v1 submitted 17 May, 2021;
originally announced May 2021.
-
Daemon computers versus clairvoyant computers: A pure theoretical viewpoint towards energy consumption of computing
Authors:
Alireza Ejlali
Abstract:
Energy consumption of computing has found increasing prominence, but the area still suffers from the lack of a consolidated formal theory. In this paper, a theory for the energy consumption of computing is structured as an axiomatic system. The work is purely theoretical, involving theorem proving and mathematical reasoning. It is also interdisciplinary, so that while it targets computing, it involves theoretical physics (thermodynamics and statistical mechanics) and information theory. The theory does not contradict existing theories in theoretical physics and conforms to them, as indeed it adopts its axioms from them. Nevertheless, the theory leads to interesting and important conclusions that have not been discussed in previous work. Some of them are: (i) Landauer's principle is shown to be a provable theorem provided that a precondition, named macroscopic determinism, holds. (ii) It is proved that real randomness (not pseudo randomness) can be used in computing in conjunction with or as an alternative to reversibility to achieve more energy saving. (iii) The theory propounds the concept that computers that use real randomness may apparently challenge the second law of thermodynamics. These are the computational counterparts of Maxwell's daemon in thermodynamics and hence are named daemon computers. (iv) It is proved that if we do not accept the existence of daemon computers (to conform to the second law of thermodynamics), another type of computer, named clairvoyant computers, must exist that can gain information about other physical systems through real randomness. This theorem probably provides a theoretical explanation for strange observations about real randomness made in the global consciousness project at Princeton University.
Submitted 12 November, 2020;
originally announced November 2020.
-
Power-Aware Run-Time Scheduler for Mixed-Criticality Systems on Multi-Core Platform
Authors:
Behnaz Ranjbar,
Tuan D. A. Nguyen,
Alireza Ejlali,
Akash Kumar
Abstract:
In modern multi-core Mixed-Criticality (MC) systems, a rise in peak power consumption due to parallel execution of tasks at maximum frequency, especially in the overload situation, may lead to thermal issues, which may affect the reliability and timeliness of MC systems. Therefore, managing peak power consumption has become imperative in multi-core MC systems. In this regard, we propose an online peak power and thermal management heuristic for multi-core MC systems. This heuristic reduces the peak power consumption of the system as much as possible during runtime by exploiting dynamic slack and per-cluster Dynamic Voltage and Frequency Scaling (DVFS). Specifically, our approach examines multiple tasks ahead to determine the most appropriate one for slack assignment, namely the one that has the most impact on the system peak power and temperature. However, changing the frequency and selecting a proper task for slack assignment and a proper core for task re-mapping at runtime can be time-consuming and may cause deadline violations, which are not admissible for high-criticality tasks. Therefore, we analyze and then optimize our run-time scheduler and evaluate it for various platforms. The proposed approach is experimentally validated on the ODROID-XU3 (DVFS-enabled heterogeneous multi-core platform) with various embedded real-time benchmarks. Results show that our heuristic achieves up to 5.25% reduction in system peak power and 20.33% reduction in maximum temperature compared to an existing method while meeting deadline constraints in different criticality modes.
Submitted 6 November, 2020;
originally announced November 2020.