
Showing 1–17 of 17 results for author: Wistoff, N

  1. Ramping Up Open-Source RISC-V Cores: Assessing the Energy Efficiency of Superscalar, Out-of-Order Execution

    Authors: Zexin Fu, Riccardo Tedeschi, Gianmarco Ottavi, Nils Wistoff, César Fuguet, Davide Rossi, Luca Benini

    Abstract: Open-source RISC-V cores are increasingly demanded in domains like automotive and space, where achieving high instructions per cycle (IPC) through superscalar and out-of-order (OoO) execution is crucial. However, high-performance open-source RISC-V cores face adoption challenges: some (e.g. BOOM, Xiangshan) are developed in Chisel with limited support from industrial electronic design automation (…

    Submitted 30 May, 2025; originally announced May 2025.

  2. arXiv:2505.03762

    cs.AR

    CVA6S+: A Superscalar RISC-V Core with High-Throughput Memory Architecture

    Authors: Riccardo Tedeschi, Gianmarco Ottavi, Côme Allart, Nils Wistoff, Zexin Fu, Filippo Grillotti, Fabio De Ambroggi, Elio Guidetti, Jean-Baptiste Rigaud, Olivier Potin, Jean Roch Coulon, César Fuguet, Luca Benini, Davide Rossi

    Abstract: Open-source RISC-V cores are increasingly adopted in high-end embedded domains such as automotive, where maximizing instructions per cycle (IPC) is becoming critical. Building on the industry-supported open-source CVA6 core and its superscalar variant, CVA6S, we introduce CVA6S+, an enhanced version incorporating improved branch prediction, register renaming and enhanced operand forwarding. These…

    Submitted 8 May, 2025; v1 submitted 20 April, 2025; originally announced May 2025.

    Comments: 3 pages, 1 figure

  3. AraOS: Analyzing the Impact of Virtual Memory Management on Vector Unit Performance

    Authors: Matteo Perotti, Vincenzo Maisto, Moritz Imfeld, Nils Wistoff, Alessandro Cilardo, Luca Benini

    Abstract: Vector processor architectures offer an efficient solution for accelerating data-parallel workloads (e.g., ML, AI), reducing instruction count, and enhancing processing efficiency. This is evidenced by the increasing adoption of vector ISAs, such as Arm's SVE/SVE2 and RISC-V's RVV, not only in high-performance computers but also in embedded systems. The open-source nature of RVV has particularly e…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Submitted to CF25-OSHW: Workshop on Open-Source Hardware (3rd Edition), co-located with Computing Frontiers 2025

  4. arXiv:2502.18953

    cs.AR cs.DC

    A Reliable, Time-Predictable Heterogeneous SoC for AI-Enhanced Mixed-Criticality Edge Applications

    Authors: Angelo Garofalo, Alessandro Ottaviano, Matteo Perotti, Thomas Benz, Yvan Tortorella, Robert Balas, Michael Rogenmoser, Chi Zhang, Luca Bertaccini, Nils Wistoff, Maicol Ciani, Cyril Koenig, Mattia Sinigaglia, Luca Valente, Paul Scheffler, Manuel Eggimann, Matheus Cavalcante, Francesco Restuccia, Alessandro Biondi, Francesco Conti, Frank K. Gürkaynak, Davide Rossi, Luca Benini

    Abstract: Next-generation mixed-criticality Systems-on-chip (SoCs) for robotics, automotive, and space must execute mixed-criticality AI-enhanced sensor processing and control workloads, ensuring reliable and time-predictable execution of critical tasks sharing resources with non-critical tasks, while also fitting within a sub-2W power envelope. To tackle these multi-dimensional challenges, in this brief, w…

    Submitted 26 February, 2025; originally announced February 2025.

  5. arXiv:2502.02626

    cs.OH

    ArtistIC: An Open-Source Toolchain for Top-Metal IC Art and Ultra-High-Fidelity GDSII Renders

    Authors: Thomas Benz, Paul Scheffler, Nils Wistoff, Philippe Sauter, Beat Muheim, Luca Benini

    Abstract: Open-source projects require outreach material to grow their community, secure funds, and strengthen their influence. Numbers, specifications, and facts alone are intangible to uninvolved people; using a clear brand and appealing visual material is thus ample to reach a broad audience. This is especially true for application-specific integrated circuits (ASICs) during the early stages of the devel…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: 2 pages, 3 figures

  6. Occamy: A 432-Core Dual-Chiplet Dual-HBM2E 768-DP-GFLOP/s RISC-V System for 8-to-64-bit Dense and Sparse Computing in 12nm FinFET

    Authors: Paul Scheffler, Thomas Benz, Viviane Potocnik, Tim Fischer, Luca Colagrande, Nils Wistoff, Yichao Zhang, Luca Bertaccini, Gianmarco Ottavi, Manuel Eggimann, Matheus Cavalcante, Gianna Paulin, Frank K. Gürkaynak, Davide Rossi, Luca Benini

    Abstract: ML and HPC applications increasingly combine dense and sparse memory access computations to maximize storage efficiency. However, existing CPUs and GPUs struggle to flexibly handle these heterogeneous workloads with consistently high compute efficiency. We present Occamy, a 432-Core, 768-DP-GFLOP/s, dual-HBM2E, dual-chiplet RISC-V system with a latency-tolerant hierarchical interconnect and in-cor…

    Submitted 13 January, 2025; originally announced January 2025.

    Comments: 16 pages, 13 figures, 1 table. Accepted for publication in IEEE JSSC

  7. arXiv:2410.07798

    cs.AR

    vCLIC: Towards Fast Interrupt Handling in Virtualized RISC-V Mixed-criticality Systems

    Authors: Enrico Zelioli, Alessandro Ottaviano, Robert Balas, Nils Wistoff, Angelo Garofalo, Luca Benini

    Abstract: The widespread diffusion of compute-intensive edge-AI workloads and the stringent demands of modern autonomous systems require advanced heterogeneous embedded architectures. Such architectures must support high-performance and reliable execution of parallel tasks with different levels of criticality. Hardware-assisted virtualization is crucial for isolating applications concurrently executing thes…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 4 pages, 4 figures, accepted for presentation at the 42nd IEEE International Conference on Computer Design (ICCD 2024)

  8. arXiv:2409.07576

    cs.CR

    fence.t.s: Closing Timing Channels in High-Performance Out-of-Order Cores through ISA-Supported Temporal Partitioning

    Authors: Nils Wistoff, Gernot Heiser, Luca Benini

    Abstract: Microarchitectural timing channels exploit information leakage between security domains that should be isolated, bypassing the operating system's security boundaries. These channels result from contention for shared microarchitectural state. In the RISC-V instruction set, the temporal fence instruction (fence.t) was proposed to close timing channels by providing an operating system with the means…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 8 pages, 3 figures, 1 algorithm, 1 listing. Accepted at the 2024 International Conference on Applications in Electronics Pervading Industry, Environment and Society (APPLEPIES 2024)

  9. arXiv:2407.19895

    eess.SY

    Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor

    Authors: Riccardo Tedeschi, Luca Valente, Gianmarco Ottavi, Enrico Zelioli, Nils Wistoff, Massimiliano Giacometti, Abdul Basit Sajjad, Luca Benini, Davide Rossi

    Abstract: Symmetric Multi-Processing (SMP) based on cache coherency is crucial for high-end embedded systems like automotive applications. RISC-V is gaining traction, and open-source hardware (OSH) platforms offer solutions to issues such as IP costs and vendor dependency. Existing multi-core cache-coherent RISC-V platforms are complex and not efficient for small embedded core clusters. We propose an open-s…

    Submitted 12 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: 4 pages, 4 figures, DSD2024 and SEAA2024 Works in Progress Session AUG 2024; Updated the acknowledgments

  10. arXiv:2406.15068

    cs.AR

    Occamy: A 432-Core 28.1 DP-GFLOP/s/W 83% FPU Utilization Dual-Chiplet, Dual-HBM2E RISC-V-based Accelerator for Stencil and Sparse Linear Algebra Computations with 8-to-64-bit Floating-Point Support in 12nm FinFET

    Authors: Gianna Paulin, Paul Scheffler, Thomas Benz, Matheus Cavalcante, Tim Fischer, Manuel Eggimann, Yichao Zhang, Nils Wistoff, Luca Bertaccini, Luca Colagrande, Gianmarco Ottavi, Frank K. Gürkaynak, Davide Rossi, Luca Benini

    Abstract: We present Occamy, a 432-core RISC-V dual-chiplet 2.5D system for efficient sparse linear algebra and stencil computations on FP64 and narrow (32-, 16-, 8-bit) SIMD FP data. Occamy features 48 clusters of RISC-V cores with custom extensions, two 64-bit host cores, and a latency-tolerant multi-chiplet interconnect and memory system with 32 GiB of HBM2E. It achieves leading-edge utilization on stenc…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 2 pages, 7 figures. Accepted at the 2024 IEEE Symposium on VLSI Technology & Circuits

  11. A Heterogeneous RISC-V based SoC for Secure Nano-UAV Navigation

    Authors: Luca Valente, Alessandro Nadalini, Asif Veeran, Mattia Sinigaglia, Bruno Sa, Nils Wistoff, Yvan Tortorella, Simone Benatti, Rafail Psiakis, Ari Kulmala, Baker Mohammad, Sandro Pinto, Daniele Palossi, Luca Benini, Davide Rossi

    Abstract: The rapid advancement of energy-efficient parallel ultra-low-power (ULP) microcontroller units (MCUs) is enabling the development of autonomous nano-sized unmanned aerial vehicles (nano-UAVs). These sub-10cm drones represent the next generation of unobtrusive robotic helpers and ubiquitous smart sensors. However, nano-UAVs face significant power and payload constraints while requiring advanced comput…

    Submitted 7 January, 2024; originally announced January 2024.

  12. arXiv:2310.17046

    cs.OS cs.CR cs.LO

    Proving the Absence of Microarchitectural Timing Channels

    Authors: Scott Buckley, Robert Sison, Nils Wistoff, Curtis Millar, Toby Murray, Gerwin Klein, Gernot Heiser

    Abstract: Microarchitectural timing channels are a major threat to computer security. A set of OS mechanisms called time protection was recently proposed as a principled way of preventing information leakage through such channels and prototyped in the seL4 microkernel. We formalise time protection and the underlying hardware mechanisms in a way that allows linking them to the information-flow proofs that sh…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Scott Buckley and Robert Sison were joint lead authors

    ACM Class: D.4.6; D.2.4; F.3.1

  13. Towards a RISC-V Open Platform for Next-generation Automotive ECUs

    Authors: Luca Cuomo, Claudio Scordino, Alessandro Ottaviano, Nils Wistoff, Robert Balas, Luca Benini, Errico Guidieri, Ida Maria Savino

    Abstract: The complexity of automotive systems is increasing quickly due to the integration of novel functionalities such as assisted or autonomous driving. However, increasing complexity poses considerable challenges to the automotive supply chain since the continuous addition of new hardware and network cabling is not considered tenable. The availability of modern heterogeneous multi-processor chips repre…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 8 pages, 2023 12th Mediterranean Conference on Embedded Computing (MECO)

    Journal ref: 2023 12th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 2023, pp. 1-8

  14. A "New Ara" for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design

    Authors: Matteo Perotti, Matheus Cavalcante, Nils Wistoff, Renzo Andri, Lukas Cavigelli, Luca Benini

    Abstract: Vector architectures are gaining traction for highly efficient processing of data-parallel workloads, driven by all major ISAs (RISC-V, Arm, Intel), and boosted by landmark chips, like the Arm SVE-based Fujitsu A64FX, powering the TOP500 leader Fugaku. The RISC-V V extension has recently reached 1.0-Frozen status. Here, we present its first open-source implementation, discuss the new specification…

    Submitted 9 January, 2025; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted version of the article published in "2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP)"

  15. On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster

    Authors: Michael Rogenmoser, Nils Wistoff, Pirmin Vogel, Frank Gürkaynak, Luca Benini

    Abstract: With the shrinking of technology nodes and the use of parallel processor clusters in hostile and critical environments, such as space, run-time faults caused by radiation are a serious cross-cutting concern, also impacting architectural design. This paper introduces an architectural approach to run-time configurable soft-error tolerance at the core level, augmenting a six-core open-source RISC-V c…

    Submitted 3 October, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Journal ref: ISVLSI (2022) 398-401

  16. arXiv:2202.12029

    cs.CR cs.AR

    Systematic Prevention of On-Core Timing Channels by Full Temporal Partitioning

    Authors: Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Gernot Heiser, Luca Benini

    Abstract: Microarchitectural timing channels enable unwanted information flow across security boundaries, violating fundamental security assumptions. They leverage timing variations of several state-holding microarchitectural components and have been demonstrated across instruction set architectures and hardware implementations. Analogously to memory protection, Ge et al. have proposed time protection for p…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: This work has been submitted to the IEEE for possible publication. arXiv admin note: text overlap with arXiv:2005.02193

  17. arXiv:2005.02193

    cs.CR cs.AR

    Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core

    Authors: Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser

    Abstract: Covert channels enable information leakage across security boundaries of the operating system. Microarchitectural covert channels exploit changes in execution timing resulting from competing access to limited hardware resources. We use the recent experimental support for time protection, aimed at preventing covert channels, in the seL4 microkernel and evaluate the efficacy of the mechanisms agains…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: 6 pages, 7 figures, submitted to CARRV '20, additional appendix