-
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI
Authors:
Arya Tschand,
Arun Tejusve Raghunath Rajan,
Sachin Idgunji,
Anirban Ghosh,
Jeremy Holleman,
Csaba Kiraly,
Pawan Ambalkar,
Ritika Borkar,
Ramesh Chukka,
Trevor Cockrell,
Oliver Curtis,
Grigori Fursin,
Miro Hodak,
Hiwot Kassa,
Anton Lokhmotov,
Dejan Miskovic,
Yuechao Pan,
Manu Prasad Manmathan,
Liz Raymond,
Tom St. John,
Arjun Suresh,
Rowan Taubitz,
Sean Zhan,
Scott Wasson,
David Kanter,
et al. (1 additional author not shown)
Abstract:
Rapid adoption of machine learning (ML) technologies has led to a surge in power consumption across diverse systems, from tiny IoT devices to massive datacenter clusters. Benchmarking the energy efficiency of these systems is crucial for optimization, but presents novel challenges due to the variety of hardware platforms, workload characteristics, and system-level interactions. This paper introduces MLPerf Power, a comprehensive benchmarking methodology with capabilities to evaluate the energy efficiency of ML systems at power levels ranging from microwatts to megawatts. Developed by a consortium of industry professionals from more than 20 organizations, MLPerf Power establishes rules and best practices to ensure comparability across diverse architectures. We use representative workloads from the MLPerf benchmark suite to collect 1,841 reproducible measurements from 60 systems across the entire range of ML deployment scales. Our analysis reveals trade-offs between performance, complexity, and energy efficiency across this wide range of systems, providing actionable insights for designing optimized ML solutions from the smallest edge devices to the largest cloud infrastructures. This work emphasizes the importance of energy efficiency as a key metric in the evaluation and comparison of ML systems, laying the foundation for future research in this critical area. We discuss the implications for developing sustainable AI solutions and standardizing energy efficiency benchmarking for ML systems.
Submitted 5 February, 2025; v1 submitted 15 October, 2024;
originally announced October 2024.
-
On the Design of Ethereum Data Availability Sampling: A Comprehensive Simulation Study
Authors:
Arunima Chaudhuri,
Sudipta Basak,
Csaba Kiraly,
Dmitriy Ryajov,
Leonardo Bautista-Gomez
Abstract:
This paper presents an in-depth exploration of Data Availability Sampling (DAS) and sharding mechanisms within decentralized systems through simulation-based analysis. DAS, a pivotal concept in blockchain technology and decentralized networks, is thoroughly examined to unravel its intricacies and assess its impact on system performance. Through the development of a simulator tailored explicitly for DAS, we embark on a comprehensive investigation into the parameters that influence system behavior and efficiency. A series of experiments are conducted within the simulated environment to validate theoretical formulations and dissect the interplay of DAS parameters. This includes an exploration of approaches such as custody by row, variations in validators per node, and malicious nodes. The outcomes of these experiments furnish insights into the efficacy of DAS protocols and pave the way for the formulation of optimization strategies geared towards enhancing decentralized network performance. Moreover, the findings serve as guidelines for future research endeavors, offering a nuanced understanding of the complexities inherent in decentralized systems. This study not only contributes to the theoretical understanding of DAS but also offers practical implications for the design, implementation, and optimization of decentralized systems.
Submitted 25 July, 2024;
originally announced July 2024.
-
Scalability limitations of Kademlia DHTs when enabling Data Availability Sampling in Ethereum
Authors:
Mikel Cortes-Goicoechea,
Csaba Kiraly,
Dmitriy Ryajov,
Jose Luis Muñoz-Tapia,
Leonardo Bautista-Gomez
Abstract:
Scalability in blockchain remains a significant challenge, especially when prioritizing decentralization and security. The Ethereum community has proposed comprehensive data-sharding techniques to overcome storage, computational, and network processing limitations. In this context, the propagation and availability of large blocks become the subject of research to achieve scalable data-sharding. This paper provides insights after exploring the usage of a Kademlia-based DHT to enable Data Availability Sampling (DAS) in Ethereum. It presents a DAS-DHT simulator to study this problem and validates the results of the simulator with experiments in a real DHT network, IPFS. Our results help us understand what parts of DAS can be achieved based on existing Kademlia DHT solutions and which ones cannot. We discuss the limitations of DHT solutions and consider alternatives.
Submitted 15 February, 2024;
originally announced February 2024.
-
Shortest Odd Paths in Undirected Graphs with Conservative Weight Functions
Authors:
Alpár Jüttner,
Csaba Király,
Lydia Mirabel Mendoza-Cadena,
Gyula Pap,
Ildikó Schlotter,
Yutaro Yamaguchi
Abstract:
We consider the Shortest Odd Path problem, where given an undirected graph $G$, a weight function on its edges, and two vertices $s$ and $t$ in $G$, the aim is to find an $(s,t)$-path with odd length and, among all such paths, of minimum weight. For the case when the weight function is conservative, i.e., when every cycle has non-negative total weight, the complexity of the Shortest Odd Path problem had been open for 20 years, and was recently shown to be NP-hard. We give a polynomial-time algorithm for the special case when the weight function is conservative and the set $E^-$ of negative-weight edges forms a single tree. Our algorithm exploits the strong connection between Shortest Odd Path and the problem of finding two internally vertex-disjoint paths between two terminals in an undirected edge-weighted graph. It also relies on solving an intermediary problem variant called Shortest Parity-Constrained Odd Path where for certain edges we have parity constraints on their position along the path. Also, we exhibit two FPT algorithms for solving Shortest Odd Path in graphs with conservative weight functions. The first FPT algorithm is parameterized by $|E^-|$, the number of negative edges, or more generally, by the maximum size of a matching in the subgraph of $G$ spanned by $E^-$. Our second FPT algorithm is parameterized by the treewidth of $G$.
Submitted 24 August, 2023;
originally announced August 2023.
-
MLPerf Tiny Benchmark
Authors:
Colby Banbury,
Vijay Janapa Reddi,
Peter Torelli,
Jeremy Holleman,
Nat Jeffries,
Csaba Kiraly,
Pietro Montino,
David Kanter,
Sebastian Ahmed,
Danilo Pau,
Urmish Thakker,
Antonio Torrini,
Peter Warden,
Jay Cordaro,
Giuseppe Di Guglielmo,
Javier Duarte,
Stephen Gibellini,
Videet Parekh,
Honson Tran,
Nhan Tran,
Niu Wenxu,
Xu Xuesong
Abstract:
Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted and easily reproducible benchmark for these systems. To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. The benchmark suite is the collaborative effort of more than 50 organizations from industry and academia and reflects the needs of the community. MLPerf Tiny measures the accuracy, latency, and energy of machine learning inference to properly evaluate the tradeoffs between systems. Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible manner. The suite features four benchmarks: keyword spotting, visual wake words, image classification, and anomaly detection.
Submitted 24 August, 2021; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Offloading Execution from Edge to Cloud: a Dynamic Node-RED Based Approach
Authors:
Román Sosa,
Csaba Kiraly,
Juan D. Parra Rodriguez
Abstract:
Fog computing enables use cases where data produced in end devices are stored, processed, and acted on directly at the edges of the network, yet computation can be offloaded to more powerful instances through the edge-to-cloud continuum. Such an offloading mechanism is especially needed for modern multi-purpose IoT gateways, where both demand and operating conditions can vary widely between deployments. To facilitate the development and operation of gateways, we implement offloading directly as part of the IoT rapid prototyping process embedded in the software stack, based on Node-RED. We evaluate the implemented method using an image processing example, and compare various offloading strategies based on resource consumption and other system metrics, highlighting the differences in how they handle demand and in the service levels reached.
Submitted 26 October, 2018;
originally announced October 2018.
-
Towards Multi-container Deployment on IoT Gateways
Authors:
Koustabh Dolui,
Csaba Kiraly
Abstract:
Stringent latency requirements in advanced Internet of Things (IoT) applications as well as an increased load on cloud data centers have prompted a move towards a more decentralized approach, bringing storage and processing of IoT data closer to the end-devices through the deployment of multi-purpose IoT gateways. However, the resource-constrained nature and diversity of these gateways pose a challenge in developing applications that can be deployed widely. This challenge can be overcome with containerization, a form of lightweight virtualization, bringing support for a wide range of hardware architectures and operating-system-agnostic deployment of applications on IoT gateways. This paper discusses the architectural aspects of containerization, and studies the suitability of available containerization tools for multi-container deployment in the context of IoT gateways. We present containerization in the context of AGILE, a multi-container and micro-service based open source framework for IoT gateways, developed as part of a Horizon 2020 project. Our study of containerized services that perform common gateway functions, such as device discovery, data management, and cloud integration, reveals the advantages of a containerized environment for IoT gateways with regard to the use of base image hierarchies and image layering for in-container and cross-container performance optimizations. We illustrate these results in a set of benchmark experiments in this paper.
Submitted 4 October, 2018;
originally announced October 2018.