
Showing 1–12 of 12 results for author: Armour, W

Searching in archive cs.
  1. arXiv:2502.21163  [pdf, other]

    cs.CV

    Adaptive Illumination-Invariant Synergistic Feature Integration in a Stratified Granular Framework for Visible-Infrared Re-Identification

    Authors: Yuheng Jia, Wesley Armour

    Abstract: Visible-Infrared Person Re-Identification (VI-ReID) plays a crucial role in applications such as search and rescue, infrastructure protection, and nighttime surveillance. However, it faces significant challenges due to modality discrepancies, varying illumination, and frequent occlusions. To overcome these obstacles, we propose AMINet, an Adaptive Modality Interaction Network. AMINet empl…

    Submitted 28 February, 2025; originally announced February 2025.

  2. arXiv:2502.16631  [pdf, other]

    cs.DC

    CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads

    Authors: Radostin Stoyanov, Viktória Spišaková, Jesus Ramos, Steven Gurfinkel, Andrei Vagin, Adrian Reber, Wesley Armour, Rodrigo Bruno

    Abstract: Deep learning training at scale is resource-intensive and time-consuming, often running across hundreds or thousands of GPUs for weeks or months. Efficient checkpointing is crucial for running these workloads, especially in multi-tenant environments where compute resources are shared, and job preemptions or interruptions are common. However, transparent and unified GPU snapshots are particularly c…

    Submitted 23 February, 2025; originally announced February 2025.

  3. arXiv:2412.09731  [pdf, other]

    cs.CV

    Double-Exponential Increases in Inference Energy: The Cost of the Race for Accuracy

    Authors: Zeyu Yang, Karel Adamek, Wesley Armour

    Abstract: Deep learning models in computer vision have achieved significant success but pose increasing concerns about energy consumption and sustainability. Despite these concerns, there is a lack of comprehensive understanding of their energy efficiency during inference. In this study, we conduct a comprehensive analysis of the inference energy consumption of 1,200 ImageNet classification models - the lar…

    Submitted 12 December, 2024; originally announced December 2024.

  4. arXiv:2402.12396  [pdf, ps, other]

    astro-ph.HE cs.AI

    Toward using GANs in astrophysical Monte-Carlo simulations

    Authors: Ahab Isaac, Wesley Armour, Karel Adámek

    Abstract: Accurate modelling of spectra produced by X-ray sources requires the use of Monte-Carlo simulations. These simulations need to evaluate physical processes, such as those occurring in accretion processes around compact objects by sampling a number of different probability distributions. This is computationally time-consuming and could be sped up if replaced by neural networks. We demonstrate, on an…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Proceedings of ADASS XXXIII (2023)

    ACM Class: I.2.0

  5. Part-time Power Measurements: nvidia-smi's Lack of Attention

    Authors: Zeyu Yang, Karel Adamek, Wesley Armour

    Abstract: The GPU has emerged as the go-to accelerator for high throughput and parallel workloads, spanning scientific simulations to AI, thanks to its performance and power efficiency. Given that 6 out of the top 10 fastest supercomputers in the world use NVIDIA GPUs and many AI companies each employ 10,000's of NVIDIA GPUs, an accurate understanding of GPU power consumption is essential for making progres…

    Submitted 12 December, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  6. arXiv:2311.05341  [pdf, other]

    astro-ph.IM cs.DC

    Accelerating Dedispersion using Many-Core Architectures

    Authors: Jan Novotný, Karel Adámek, M. A. Clark, Mike Giles, Wesley Armour

    Abstract: Astrophysical radio signals are excellent probes of extreme physical processes that emit them. However, to reach Earth, electromagnetic radiation passes through the ionised interstellar medium (ISM), introducing a frequency-dependent time delay (dispersion) to the emitted signal. Removing dispersion enables searches for transient signals like Fast Radio Bursts (FRB) or repeating signals from isola…

    Submitted 9 November, 2023; originally announced November 2023.

    Journal ref: The Astrophysical Journal Supplement Series, Volume 269, Number 1, 2023

  7. arXiv:2211.13517  [pdf, ps, other]

    astro-ph.IM cs.DC

    Cutting the cost of pulsar astronomy: Saving time and energy when searching for binary pulsars using NVIDIA GPUs

    Authors: Jack White, Karel Adamek, Wes Armour

    Abstract: Using the Fourier Domain Acceleration Search (FDAS) method to search for binary pulsars is a computationally costly process. Next generation radio telescopes will have to perform FDAS in real time, as data volumes are too large to store. FDAS is a matched filtering approach for searching time-domain radio astronomy datasets for the signatures of binary pulsars with approximately linear acceleratio…

    Submitted 24 November, 2022; originally announced November 2022.

  8. arXiv:2101.00941  [pdf, ps, other]

    astro-ph.IM cs.DC

    Implementing CUDA Streams into AstroAccelerate -- A Case Study

    Authors: Jan Novotný, Karel Adámek, Wes Armour

    Abstract: To be able to run tasks asynchronously on NVIDIA GPUs a programmer must explicitly implement asynchronous execution in their code using the syntax of CUDA streams. Streams allow a programmer to launch independent concurrent execution tasks, providing the ability to utilise different functional units on the GPU asynchronously. For example, it is possible to transfer the results from a previous comp…

    Submitted 6 May, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: submitted to ADASS XXX, 3 pages

  9. Efficiency Near the Edge: Increasing the Energy Efficiency of FFTs on GPUs for Real-time Edge Computing

    Authors: Karel Adámek, Jan Novotný, Jeyarajan Thiyagalingam, Wesley Armour

    Abstract: The Square Kilometre Array (SKA) is an international initiative for developing the world's largest radio telescope with a total collecting area of over a million square meters. The scale of the operation, combined with the remote location of the telescope, requires the use of energy-efficient computational algorithms. This, along with the extreme data rates that will be produced by the SKA and the…

    Submitted 9 November, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

    Comments: published in IEEE Access

    ACM Class: C.4

    Journal ref: in IEEE Access, vol. 9, pp. 18167-18182, 2021

  10. arXiv:1910.01972  [pdf, ps, other]

    cs.MS cs.DC cs.PF

    GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

    Authors: Karel Adámek, Sofia Dimoudi, Mike Giles, Wesley Armour

    Abstract: We present an implementation of the overlap-and-save method, a method for the convolution of very long signals with short response functions, which is tailored to GPUs. We have implemented several FFT algorithms (using the CUDA programming language) which exploit GPU shared memory, allowing for GPU accelerated convolution. We compare our implementation with an implementation of the overlap-and-sav…

    Submitted 10 April, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

    Comments: accepted to ACM TACO

    Journal ref: ACM Trans. Archit. Code Optim. 17, 3, Article 18 (September 2020)

  11. A polyphase filter for many-core architectures

    Authors: Karel Adámek, Jan Novotný, Wes Armour

    Abstract: In this article we discuss our implementation of a polyphase filter for real-time data processing in radio astronomy. We describe in detail our implementation of the polyphase filter algorithm and its behaviour on three generations of NVIDIA GPU cards, on dual Intel Xeon CPUs and the Intel Xeon Phi (Knights Corner) platforms. All of our implementations aim to exploit the potential for data reuse t…

    Submitted 21 April, 2016; v1 submitted 11 November, 2015; originally announced November 2015.

    Comments: 19 pages, 20 figures, 5 tables

  12. arXiv:1411.3656  [pdf, other]

    cs.DC cs.PF

    The Implementation of a Real-Time Polyphase Filter

    Authors: Karel Adámek, Jan Novotný, Wes Armour

    Abstract: In this article we study the suitability of different computational accelerators for the task of real-time data processing. The algorithm used for comparison is the polyphase filter, a standard tool in signal processing and a well established algorithm. We measure performance in FLOPs and execution time, which is a critical factor for real-time systems. For our real-time studies we have chosen a dat…

    Submitted 12 November, 2014; originally announced November 2014.

    Comments: Proceedings of WDS 2014, Charles University in Prague, Faculty of Mathematics and Physics Troja, Prague