Showing 1–4 of 4 results for author: Van Delm, J

  1. A Multi-level Compiler Backend for Accelerated Micro-kernels Targeting RISC-V ISA Extensions

    Authors: Alexandre Lopoukhine, Federico Ficarelli, Christos Vasiladiotis, Anton Lydike, Josse Van Delm, Alban Dutilleul, Luca Benini, Marian Verhelst, Tobias Grosser

    Abstract: High-performance micro-kernels must fully exploit today's diverse and specialized hardware to deliver peak performance to DNNs. While higher-level optimizations for DNNs are offered by numerous compilers (e.g., MLIR, TVM, OpenXLA), performance-critical micro-kernels are left to specialized code generators or handwritten assembly. Even though widely-adopted compilers (e.g., LLVM, GCC) offer tuned b…

    Submitted 6 February, 2025; originally announced February 2025.

    ACM Class: D.3.4

  2. arXiv:2411.09543

    cs.AR cs.AI

    OpenGeMM: A High-Utilization GeMM Accelerator Generator with Lightweight RISC-V Control and Tight Memory Coupling

    Authors: Xiaoling Yi, Ryan Antonio, Joren Dumoulin, Jiacong Sun, Josse Van Delm, Guilherme Paim, Marian Verhelst

    Abstract: Deep neural networks (DNNs) face significant challenges when deployed on resource-constrained extreme edge devices due to their computational and data-intensive nature. While standalone accelerators tailored for specific application scenarios suffer from inflexible control and limited programmability, generic hardware acceleration platforms coupled with RISC-V CPUs can enable high reusability and…

    Submitted 21 November, 2024; v1 submitted 14 November, 2024; originally announced November 2024.

  3. arXiv:2410.08855

    cs.DC cs.AI

    MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices

    Authors: Mohamed Amine Hamdi, Francesco Daghero, Giuseppe Maria Sarda, Josse Van Delm, Arne Symons, Luca Benini, Marian Verhelst, Daniele Jahier Pagliari, Alessio Burrello

    Abstract: Streamlining the deployment of Deep Neural Networks (DNNs) on heterogeneous edge platforms, coupling within the same micro-controller unit (MCU) instruction processors and hardware accelerators for tensor computations, is becoming one of the crucial challenges of the TinyML field. The best-performing DNN compilation toolchains are usually deeply customized for a single MCU family, and porting to…

    Submitted 11 October, 2024; originally announced October 2024.

    Comments: 13 pages, 11 figures, 4 tables

    ACM Class: I.2.2; D.1.3

  4. HTVM: Efficient Neural Network Deployment On Heterogeneous TinyML Platforms

    Authors: Josse Van Delm, Maarten Vandersteegen, Alessio Burrello, Giuseppe Maria Sarda, Francesco Conti, Daniele Jahier Pagliari, Luca Benini, Marian Verhelst

    Abstract: Optimal deployment of deep neural networks (DNNs) on state-of-the-art Systems-on-Chips (SoCs) is crucial for tiny machine learning (TinyML) at the edge. The complexity of these SoCs makes deployment non-trivial, as they typically contain multiple heterogeneous compute cores with limited, programmer-managed memory to optimize latency and energy efficiency. We propose HTVM - a compiler that merges T…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Presented at DAC2023. Open-source code is available at https://github.com/KULeuven-MICAS/htvm

    ACM Class: D.3.4

    Journal ref: 2023 60th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 2023, pp. 1-6