Showing 1–5 of 5 results for author: Derumigny, N

  1. arXiv:2412.13207  [pdf, other]

    cs.DC cs.PF

    Performance Debugging through Microarchitectural Sensitivity and Causality Analysis

    Authors: Alban Dutilleul, Hugo Pompougnac, Nicolas Derumigny, Gabriel Rodriguez, Valentin Trophime, Christophe Guillon, Fabrice Rastello

    Abstract: Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performance issues are critical tasks to fully exploit the performance offered by hardware resources. Current performance debugging approaches rely either on measuring resource utilization, in order to esti…

    Submitted 3 December, 2024; originally announced December 2024.

  2. arXiv:2402.15773  [pdf, other]

    cs.PF

    Performance bottlenecks detection through microarchitectural sensitivity

    Authors: Hugo Pompougnac, Alban Dutilleul, Christophe Guillon, Nicolas Derumigny, Fabrice Rastello

    Abstract: Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performance issues are critical tasks to make the most of hardware resources. We provide an in-depth overview of performance bottlenecks in recent OoO microarchitectures and describe the difficulties of det…

    Submitted 24 February, 2024; originally announced February 2024.

  3. arXiv:2401.12071  [pdf, other]

    cs.AR

    An Irredundant and Compressed Data Layout to Optimize Bandwidth Utilization of FPGA Accelerators

    Authors: Corentin Ferry, Nicolas Derumigny, Steven Derrien, Sanjay Rajopadhye

    Abstract: Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data-sets. A large body of work focuses on reducing off-chip transfers, but few authors try to improve the efficiency of transfers. This paper addresses the latter issue by proposing (i) a compiler-based approach to accelerator's data layout to maximize contiguou…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 11 pages, 11 figures, 2 tables

  4. arXiv:2012.11473  [pdf, other]

    cs.AR cs.PF

    PALMED: Throughput Characterization for Superscalar Architectures -- Extended Version

    Authors: Nicolas Derumigny, Fabian Gruber, Théophile Bastian, Guillaume Iooss, Christophe Guillon, Louis-Noël Pouchet, Fabrice Rastello

    Abstract: In a super-scalar architecture, the scheduler dynamically assigns micro-operations ($μ$OPs) to execution ports. The port mapping of an architecture describes how an instruction decomposes into $μ$OPs and lists for each $μ$OP the set of ports it can be mapped to. It is used by compilers and performance debugging tools to characterize the performance throughput of a sequence of instructions repeated…

    Submitted 18 January, 2022; v1 submitted 21 December, 2020; originally announced December 2020.

  5. arXiv:2007.03152  [pdf, other]

    cs.AR

    The gem5 Simulator: Version 20.0+

    Authors: Jason Lowe-Power, Abdul Mutaal Ahmad, Ayaz Akram, Mohammad Alian, Rico Amslinger, Matteo Andreozzi, Adrià Armejach, Nils Asmussen, Brad Beckmann, Srikant Bharadwaj, Gabe Black, Gedare Bloom, Bobby R. Bruce, Daniel Rodrigues Carvalho, Jeronimo Castrillon, Lizhong Chen, Nicolas Derumigny, Stephan Diestelhorst, Wendy Elsasser, Carlos Escuin, Marjan Fariborz, Amin Farmahini-Farahani, Pouya Fotouhi, Ryan Gambord, Jayneel Gandhi, et al. (53 additional authors not shown)

    Abstract: The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 si…

    Submitted 29 September, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: Source, comments, and feedback: https://github.com/darchr/gem5-20-paper