
Showing 1–11 of 11 results for author: Lokhmotov, A

Searching in archive cs.
  1. arXiv:2410.12032  [pdf, other]

    cs.AR cs.DC cs.LG

    MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI

    Authors: Arya Tschand, Arun Tejusve Raghunath Rajan, Sachin Idgunji, Anirban Ghosh, Jeremy Holleman, Csaba Kiraly, Pawan Ambalkar, Ritika Borkar, Ramesh Chukka, Trevor Cockrell, Oliver Curtis, Grigori Fursin, Miro Hodak, Hiwot Kassa, Anton Lokhmotov, Dejan Miskovic, Yuechao Pan, Manu Prasad Manmathan, Liz Raymond, Tom St. John, Arjun Suresh, Rowan Taubitz, Sean Zhan, Scott Wasson, David Kanter, et al. (1 additional author not shown)

    Abstract: Rapid adoption of machine learning (ML) technologies has led to a surge in power consumption across diverse systems, from tiny IoT devices to massive datacenter clusters. Benchmarking the energy efficiency of these systems is crucial for optimization, but presents novel challenges due to the variety of hardware platforms, workload characteristics, and system-level interactions. This paper introduc…

    Submitted 5 February, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: 16 pages, 11 figures, 1 table

  2. arXiv:2105.13279  [pdf, other]

    cs.CV cs.DC

    Dynamic Network selection for the Object Detection task: why it matters and what we (didn't) achieve

    Authors: Emanuele Vitali, Anton Lokhmotov, Gianluca Palermo

    Abstract: In this paper, we want to show the potential benefit of a dynamic auto-tuning approach for the inference process in the Deep Neural Network (DNN) context, tackling the object detection challenge. We benchmarked different neural networks to find the optimal detector for the well-known COCO 17 database, and we demonstrate that even if we only consider the quality of the prediction there is not a sin…

    Submitted 27 May, 2021; originally announced May 2021.

    Comments: Paper accepted at SAMOS21 - International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation

  3. arXiv:2003.04821  [pdf, other]

    cs.PF cs.LG

    Benchmarking TinyML Systems: Challenges and Direction

    Authors: Colby R. Banbury, Vijay Janapa Reddi, Max Lam, William Fu, Amin Fazel, Jeremy Holleman, Xinyuan Huang, Robert Hurtado, David Kanter, Anton Lokhmotov, David Patterson, Danilo Pau, Jae-sun Seo, Jeff Sieracki, Urmish Thakker, Marian Verhelst, Poonam Yadav

    Abstract: Recent advancements in ultra-low-power machine learning (TinyML) hardware promises to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted benchmark for these systems. Benchmarking allows us to measure and thereby systematically compare, evaluate, and improve the performance of systems and is therefore fundamental to a field re…

    Submitted 29 January, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: 6 pages, 1 figure, 3 tables

  4. arXiv:1911.02549  [pdf, other]

    cs.LG cs.PF stat.ML

    MLPerf Inference Benchmark

    Authors: Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, et al. (22 additional authors not shown)

    Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devic…

    Submitted 9 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: ISCA 2020

  5. arXiv:1806.07060  [pdf, other]

    cs.PF cs.DC cs.MS cs.SE

    A model-driven approach for a new generation of adaptive libraries

    Authors: Marco Cianfriglia, Flavio Vella, Cedric Nugteren, Anton Lokhmotov, Grigori Fursin

    Abstract: Efficient high-performance libraries often expose multiple tunable parameters to provide highly optimized routines. These can range from simple loop unroll factors or vector sizes all the way to algorithmic changes, given that some implementations can be more suitable for certain devices by exploiting hardware characteristics such as local memories and vector units. Traditionally, such parameters…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: New detailed analysis will be provided

    Report number: Volume 18 Issue 1 Pages 1-24

    Journal ref: ACM Transactions on Architecture and Code Optimization 2021

  6. arXiv:1801.08024  [pdf, other]

    cs.HC cs.CY

    A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques

    Authors: Grigori Fursin, Anton Lokhmotov, Dmitry Savenko, Eben Upton

    Abstract: Developing efficient software and hardware has never been harder whether it is for a tiny IoT device or an Exascale supercomputer. Apart from the ever growing design and optimization complexity, there exist even more fundamental problems such as lack of interdisciplinary knowledge required for effective software/hardware co-design, and a growing technology transfer gap between academia and industr…

    Submitted 19 January, 2018; originally announced January 2018.

    Comments: Interactive CK report: http://cKnowledge.org/rpi-crowd-tuning ; CK repository with artifacts: https://github.com/ctuning/ck-rpi-optimization-results ; FigShare data archive: https://doi.org/10.6084/m9.figshare.5789007.v2

  7. arXiv:1801.06378  [pdf, other]

    stat.ML cs.LG cs.SE

    Introducing ReQuEST: an Open Platform for Reproducible and Quality-Efficient Systems-ML Tournaments

    Authors: Thierry Moreau, Anton Lokhmotov, Grigori Fursin

    Abstract: Co-designing efficient machine learning based systems across the whole hardware/software stack to trade off speed, accuracy, energy and costs is becoming extremely complex and time consuming. Researchers often struggle to evaluate and compare different published works across rapidly evolving software frameworks, heterogeneous hardware platforms, compilers, libraries, algorithms, data sets, models,…

    Submitted 19 January, 2018; originally announced January 2018.

    Comments: ReQuEST tournament website: http://cKnowledge.org/request

  8. arXiv:1511.03742  [pdf, other]

    cs.MS cs.PF

    GEMMbench: a framework for reproducible and collaborative benchmarking of matrix multiplication

    Authors: Anton Lokhmotov

    Abstract: The generic matrix-matrix multiplication (GEMM) is arguably the most popular computational kernel of the 20th century. Yet, surprisingly, no common methodology for evaluating GEMM performance has been established over the many decades of using GEMM for comparing architectures, compilers and ninja-class programmers. We introduce GEMMbench, a framework and methodology for evaluating performance of…

    Submitted 17 November, 2015; v1 submitted 11 November, 2015; originally announced November 2015.

    Comments: ADAPT'16

  9. arXiv:1506.06256  [pdf, other]

    cs.SE cs.LG cs.PF

    Collective Mind, Part II: Towards Performance- and Cost-Aware Software Engineering as a Natural Science

    Authors: Grigori Fursin, Abdul Memon, Christophe Guillon, Anton Lokhmotov

    Abstract: Nowadays, engineers have to develop software often without even knowing which hardware it will eventually run on in numerous mobile phones, tablets, desktops, laptops, data centers, supercomputers and cloud services. Unfortunately, optimizing compilers are not keeping pace with ever increasing complexity of computer systems anymore and may produce severely underperforming executable codes while wa…

    Submitted 20 June, 2015; originally announced June 2015.

    Comments: Presented at the 18th International Workshop on Compilers for Parallel Computing (CPC'15), London, UK

  10. arXiv:1502.07241

    cs.AR cs.CV cs.DC

    Proceedings of the DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015)

    Authors: Frank Hannig, Dietmar Fey, Anton Lokhmotov

    Abstract: This volume contains the papers accepted at the DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015), held in Grenoble, France, March 13, 2015. HIS 2015 was co-located with the Conference on Design, Automation and Test in Europe (DATE).

    Submitted 26 February, 2015; v1 submitted 25 February, 2015; originally announced February 2015.

    Comments: Website of the workshop: https://www12.cs.fau.de/ws/his2015/

  11. arXiv:1302.5586  [pdf, other]

    cs.PL cs.DC

    PENCIL: Towards a Platform-Neutral Compute Intermediate Language for DSLs

    Authors: Riyadh Baghdadi, Albert Cohen, Serge Guelton, Sven Verdoolaege, Jun Inoue, Tobias Grosser, Georgia Kouveli, Alexey Kravets, Anton Lokhmotov, Cedric Nugteren, Fraser Waters, Alastair F. Donaldson

    Abstract: We motivate the design and implementation of a platform-neutral compute intermediate language (PENCIL) for productive and performance-portable accelerator programming.

    Submitted 22 February, 2013; originally announced February 2013.