Showing 1–10 of 10 results for author: Tortorella, Y

  1. RedMulE-FT: A Reconfigurable Fault-Tolerant Matrix Multiplication Engine

    Authors: Philip Wiese, Maurus Item, Luca Bertaccini, Yvan Tortorella, Angelo Garofalo, Luca Benini

    Abstract: As safety-critical applications increasingly rely on data-parallel floating-point computations, there is an increasing need for flexible and configurable fault tolerance in parallel floating-point accelerators such as tensor engines. While replication-based methods ensure reliability but incur high area and power costs, error correction codes lack the flexibility to trade off robustness against pe…

    Submitted 19 April, 2025; originally announced April 2025.

    Comments: Accepted to CF25-OSHW: Workshop on Open-Source Hardware (3rd Edition), co-located with Computing Frontiers 2025

    ACM Class: B.7.3; B.8.1

  2. arXiv:2503.04581

    cs.AR eess.SY

    Maestro: A 302 GFLOPS/W and 19.8GFLOPS RISC-V Vector-Tensor Architecture for Wearable Ultrasound Edge Computing

    Authors: Mattia Sinigaglia, Amirhossein Kiamarzi, Marco Bertuletti, Luigi Ghionda, Mattia Orlandi, Riccardo Tedeschi, Aurora Di Giampietro, Yvan Tortorella, Luca Bertaccini, Simone Benatti, Giuseppe Tagliavini, Luca Benini, Francesco Conti, Davide Rossi

    Abstract: Most Wearable Ultrasound (WUS) devices lack the computational power to process signals at the edge, instead relying on remote offload, which introduces latency, high power consumption, and privacy concerns. We present Maestro, a RISC-V SoC with unified Vector-Tensor Unit (VTU) and memory-coupled Fast Fourier Transform (FFT) accelerators targeting edge processing for wearable ultrasound devices, fa…

    Submitted 6 March, 2025; originally announced March 2025.

  3. arXiv:2502.18953

    cs.AR cs.DC

    A Reliable, Time-Predictable Heterogeneous SoC for AI-Enhanced Mixed-Criticality Edge Applications

    Authors: Angelo Garofalo, Alessandro Ottaviano, Matteo Perotti, Thomas Benz, Yvan Tortorella, Robert Balas, Michael Rogenmoser, Chi Zhang, Luca Bertaccini, Nils Wistoff, Maicol Ciani, Cyril Koenig, Mattia Sinigaglia, Luca Valente, Paul Scheffler, Manuel Eggimann, Matheus Cavalcante, Francesco Restuccia, Alessandro Biondi, Francesco Conti, Frank K. Gurkaynak, Davide Rossi, Luca Benini

    Abstract: Next-generation mixed-criticality Systems-on-chip (SoCs) for robotics, automotive, and space must execute mixed-criticality AI-enhanced sensor processing and control workloads, ensuring reliable and time-predictable execution of critical tasks sharing resources with non-critical tasks, while also fitting within a sub-2W power envelope. To tackle these multi-dimensional challenges, in this brief, w…

    Submitted 26 February, 2025; originally announced February 2025.

  4. arXiv:2412.06321

    cs.AR

    A Flexible Template for Edge Generative AI with High-Accuracy Accelerated Softmax & GELU

    Authors: Andrea Belano, Yvan Tortorella, Angelo Garofalo, Luca Benini, Davide Rossi, Francesco Conti

    Abstract: Transformer-based generative Artificial Intelligence (GenAI) models achieve remarkable results in a wide range of fields, including natural language processing, computer vision, and audio processing. However, this comes at the cost of increased complexity and the need of sophisticated non-linearities such as softmax and GELU. Even if Transformers are computationally dominated by matrix multiplicat…

    Submitted 9 December, 2024; originally announced December 2024.

  5. A Heterogeneous RISC-V based SoC for Secure Nano-UAV Navigation

    Authors: Luca Valente, Alessandro Nadalini, Asif Veeran, Mattia Sinigaglia, Bruno Sa, Nils Wistoff, Yvan Tortorella, Simone Benatti, Rafail Psiakis, Ari Kulmala, Baker Mohammad, Sandro Pinto, Daniele Palossi, Luca Benini, Davide Rossi

    Abstract: The rapid advancement of energy-efficient parallel ultra-low-power (ULP) microcontroller units (MCUs) is enabling the development of autonomous nano-sized unmanned aerial vehicles (nano-UAVs). These sub-10cm drones represent the next generation of unobtrusive robotic helpers and ubiquitous smart sensors. However, nano-UAVs face significant power and payload constraints while requiring advanced comput…

    Submitted 7 January, 2024; originally announced January 2024.

  6. DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training

    Authors: Angelo Garofalo, Yvan Tortorella, Matteo Perotti, Luca Valente, Alessandro Nadalini, Luca Benini, Davide Rossi, Francesco Conti

    Abstract: On-chip DNN inference and training at the Extreme-Edge (TinyML) impose strict latency, throughput, accuracy and flexibility requirements. Heterogeneous clusters are promising solutions to meet the challenge, combining the flexibility of DSP-enhanced cores with the performance and energy boost of dedicated accelerators. We present DARKSIDE, a System-on-Chip with a heterogeneous cluster of 8 RISC-V…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: 11 pages, 15 figures

  7. arXiv:2303.08706

    eess.SY cs.AR

    Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-Core Computing Clusters for Reliable Processing in Space

    Authors: Michael Rogenmoser, Yvan Tortorella, Davide Rossi, Francesco Conti, Luca Benini

    Abstract: Space Cyber-Physical Systems (S-CPS) such as spacecraft and satellites strongly rely on the reliability of onboard computers to guarantee the success of their missions. Relying solely on radiation-hardened technologies is extremely expensive, and developing inflexible architectural and microarchitectural modifications to introduce modular redundancy within a system leads to significant area increa…

    Submitted 14 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  8. arXiv:2301.03904

    cs.AR cs.AI cs.LG

    RedMule: A Mixed-Precision Matrix-Matrix Operation Engine for Flexible and Energy-Efficient On-Chip Linear Algebra and TinyML Training Acceleration

    Authors: Yvan Tortorella, Luca Bertaccini, Luca Benini, Davide Rossi, Francesco Conti

    Abstract: The increasing interest in TinyML, i.e., near-sensor machine learning on power budgets of a few tens of mW, is currently pushing toward enabling TinyML-class training as opposed to inference only. Current training algorithms, based on various forms of error and gradient backpropagation, rely on floating-point matrix operations to meet the precision and dynamic range requirements. So far, the energ…

    Submitted 6 May, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

  9. HULK-V: a Heterogeneous Ultra-low-power Linux capable RISC-V SoC

    Authors: Luca Valente, Yvan Tortorella, Mattia Sinigaglia, Giuseppe Tagliavini, Alessandro Capotondi, Luca Benini, Davide Rossi

    Abstract: IoT applications span a wide range in performance and memory footprint, under tight cost and power constraints. High-end applications rely on power-hungry Systems-on-Chip (SoCs) featuring powerful processors, large LPDDR/DDR3/4/5 memories, and supporting full-fledged Operating Systems (OS). On the contrary, low-end applications typically rely on Ultra-Low-Power microcontrollers with a "close to metal"…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: This paper has been accepted as a full paper at DATE23 https://www.date-conference.com/date-2023-accepted-papers#Regular-Papers

  10. arXiv:2204.11192

    cs.AR

    RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs

    Authors: Yvan Tortorella, Luca Bertaccini, Davide Rossi, Luca Benini, Francesco Conti

    Abstract: The fast proliferation of extreme-edge applications using Deep Learning (DL) based algorithms required dedicated hardware to satisfy extreme-edge applications' latency, throughput, and precision requirements. While inference is achievable in practical cases, online finetuning and adaptation of general DL models are still highly challenging. One of the key stumbling stones is the need for parallel…

    Submitted 24 April, 2022; originally announced April 2022.