Showing 1–19 of 19 results for author: Skjellum, A

  1. arXiv:2503.11138  [pdf, other]

    cs.DC

    The Case for ABI Interoperability in a Fault Tolerant MPI

    Authors: Yao Xu, Grace Nansamba, Anthony Skjellum, Gene Cooperman

    Abstract: There is new momentum behind an interoperable ABI for MPI, which will be a major component of MPI-5. This capability brings true separation of concerns to a running MPI computation. The linking and compilation of an MPI application becomes completely independent of the choice of MPI library. The MPI application is compiled once, and runs everywhere. This ABI allows users to independently choose:…

    Submitted 14 March, 2025; originally announced March 2025.

  2. arXiv:2501.18749  [pdf, other]

    cs.AR

    ACiS: Complex Processing in the Switch Fabric

    Authors: Pouya Haghi, Anqi Guo, Tong Geng, Anthony Skjellum, Martin Herbordt

    Abstract: For the last three decades a core use of FPGAs has been for processing communication: FPGA-based SmartNICs are in widespread use from the datacenter to IoT. Augmenting switches with FPGAs, however, has been less studied, but has numerous advantages built around the processing being moved from the edge of the network to the center. Communication switches have previously been augmented to process co…

    Submitted 30 January, 2025; originally announced January 2025.

  3. arXiv:2406.05594  [pdf, ps, other]

    cs.DC

    Understanding GPU Triggering APIs for MPI+X Communication

    Authors: Patrick G. Bridges, Anthony Skjellum, Evan D. Suggs, Derek Schafer, Purushotham V. Bangalore

    Abstract: GPU-enhanced architectures are now dominant in HPC systems, but message-passing communication involving GPUs with MPI has proven to be both complex and expensive, motivating new approaches that lower such costs. We compare and contrast stream/graph- and kernel-triggered MPI communication abstractions, whose principal purpose is to enhance the performance of communication when GPU kernels create or…

    Submitted 31 July, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  4. arXiv:2402.12203  [pdf, other]

    cs.DC cs.PF cs.SE

    MPI Implementation Profiling for Better Application Performance

    Authors: Riley Shipley, Garrett Hooten, David Boehme, Derek Schafer, Anthony Skjellum, Olga Pearce

    Abstract: While application profiling has been a mainstay in the HPC community for years, profiling of MPI and other communication middleware has not received the same degree of exploration. This paper adds to the discussion of MPI profiling, contributing two general-purpose profiling methods as well as practical applications of these methods to an existing implementation. The ability to detect performance…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 7 pages, 11 figures

  5. arXiv:2309.14996  [pdf, other]

    cs.DC

    Implementation-Oblivious Transparent Checkpoint-Restart for MPI

    Authors: Yao Xu, Leonid Belyaev, Twinkle Jain, Derek Schafer, Anthony Skjellum, Gene Cooperman

    Abstract: This work presents experience with traditional use cases of checkpointing on a novel platform. A single codebase (MANA) transparently checkpoints production workloads for major available MPI implementations: "develop once, run everywhere". The new platform enables application developers to compile their application against any of the available standards-compliant MPI implementations, and test each…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 17 pages, 4 figures

  6. arXiv:2309.07337  [pdf, other]

    cs.DC

    MPI Advance : Open-Source Message Passing Optimizations

    Authors: Amanda Bienz, Derek Schafer, Anthony Skjellum

    Abstract: The large variety of production implementations of the message passing interface (MPI) each provide unique and varying underlying algorithms. Each emerging supercomputer supports one or a small number of system MPI installations, tuned for the given architecture. Performance varies with MPI version, but application programmers are typically unable to achieve optimal performance with local MPI inst…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: Available on conference website: https://eurompi23.github.io/assets/papers/EuroMPI23_paper_33.pdf

  7. arXiv:2307.07828  [pdf, other]

    cs.DC cs.DS

    The Impact of Space-Filling Curves on Data Movement in Parallel Systems

    Authors: David Walker, Anthony Skjellum

    Abstract: Modern computer systems are characterized by deep memory hierarchies, composed of main memory, multiple layers of cache, and other specialized types of memory. In parallel and distributed systems, additional memory layers are added to this hierarchy. Achieving good performance for computational science applications, in terms of execution time, depends on the efficient use of this diverse and hiera…

    Submitted 15 July, 2023; originally announced July 2023.

    Report number: CUPECS-2023-19

    ACM Class: D.1.3; E.2

  8. arXiv:2306.16589  [pdf, other]

    cs.MS cs.PF

    Collective-Optimized FFTs

    Authors: Evelyn Namugwanya, Amanda Bienz, Derek Schafer, Anthony Skjellum

    Abstract: This paper measures the impact of the various alltoallv methods. Results are analyzed within Beatnik, a Z-model solver that is bottlenecked by HeFFTe and representative of applications that rely on FFTs.

    Submitted 4 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

  9. arXiv:2305.19946  [pdf, other]

    cs.DC

    A Survey of Potential MPI Complex Collectives: Large-Scale Mining and Analysis of HPC Applications

    Authors: Pouya Haghi, Ryan Marshall, Po Hao Chen, Anthony Skjellum, Martin Herbordt

    Abstract: Offload of MPI collectives to network devices, e.g., NICs and switches, is being implemented as an effective mechanism to improve application performance by reducing inter- and intra-node communication and bypassing MPI software layers. Given the rich deployment of accelerators and programmable NICs/switches in data centers, we posit that there is an opportunity to further improve performance by e…

    Submitted 31 May, 2023; originally announced May 2023.

  10. arXiv:2112.10814  [pdf, ps, other]

    cs.DC

    Checkpoint-Restart Libraries Must Become More Fault Tolerant

    Authors: Anthony Skjellum, Derek Schafer

    Abstract: Production MPI codes need checkpoint-restart (CPR) support. Clearly, checkpoint-restart libraries must be fault tolerant lest they open up a window of vulnerability for failures with byzantine outcomes. But, certain popular libraries that leverage MPI are evidently not fault tolerant. Nowadays, fault detection with automatic recovery without batch requeueing is a strong requirement for production…

    Submitted 20 December, 2021; originally announced December 2021.

    Comments: Short paper accepted at SuperCheck at SC21

  11. arXiv:2109.05649  [pdf, other]

    cs.CR

    Scrybe: A Secure Audit Trail for Clinical Trial Data Fusion

    Authors: Jon Oakley, Carl Worley, Lu Yu, Richard Brooks, Ilker Ozcelik, Anthony Skjellum, Jihad Obeid

    Abstract: Clinical trials are a multi-billion dollar industry. One of the biggest challenges facing the clinical trial research community is satisfying Part 11 of Title 21 of the Code of Federal Regulations and ISO 27789. These controls provide audit requirements that guarantee the reliability of the data contained in the electronic records. Context-aware smart devices and wearable IoT devices have become i…

    Submitted 12 September, 2021; originally announced September 2021.

  12. arXiv:2107.10566  [pdf, ps, other]

    cs.PL

    MPI's Language Bindings are Holding MPI Back

    Authors: Martin Ruefenacht, Derek Schafer, Anthony Skjellum, Purushotham V. Bangalore

    Abstract: Over the past two decades, C++ has been adopted as a major HPC language (displacing C to a large extent, and Fortran to some degree as well). Idiomatic C++ is clearly how C++ is being used nowadays. But, MPI's syntax and semantics are defined and extended with C and Fortran interfaces that align with the capabilities and limitations of C89 and Fortran-77. Unfortunately, the language-independent specifica…

    Submitted 22 July, 2021; originally announced July 2021.

  13. An Overview of Cryptographic Accumulators

    Authors: Ilker Ozcelik, Sai Medury, Justin Broaddus, Anthony Skjellum

    Abstract: This paper is a primer on cryptographic accumulators and how to apply them practically. A cryptographic accumulator is a space- and time-efficient data structure used for set-membership tests. Since it is possible to represent any computational problem where the answer is yes or no as a set-membership problem, cryptographic accumulators are invaluable data structures in computer science and engine…

    Submitted 7 March, 2021; originally announced March 2021.

    Comments: Note: This is an extended version of a paper published In Proceedings of the 7th International Conference on Information Systems Security and Privacy (ICISSP 2021), pages 661-669

  14. arXiv:2005.09503  [pdf, ps, other]

    eess.SP cs.CR cs.LG

    Pre-print: Radio Identity Verification-based IoT Security Using RF-DNA Fingerprints and SVM

    Authors: Donald Reising, Joseph Cancelleri, T. Daniel Loveless, Farah Kandah, Anthony Skjellum

    Abstract: It is estimated that the number of IoT devices will reach 75 billion in the next five years. Most of those currently, and to be deployed, lack sufficient security to protect themselves and their networks from attack by malicious IoT devices that masquerade as authorized devices to circumvent digital authentication approaches. This work presents a PHY layer IoT authentication approach capable of ad…

    Submitted 19 May, 2020; originally announced May 2020.

    Comments: 14 pages, 23 figures and sub-figures, Submitted to the IEEE Internet of Things Journal on May 19, 2020

    Journal ref: IEEE Internet of Things Journal 2021

  15. arXiv:1909.11762  [pdf, other]

    cs.DC

    Extending the Message Passing Interface (MPI) with User-Level Schedules

    Authors: Derek Schafer, Sheikh Ghafoor, Daniel Holmes, Martin Ruefenacht, Anthony Skjellum

    Abstract: Composability is one of seven reasons for the long-standing and continuing success of MPI. Extending MPI by composing its operations with user-level operations provides useful integration with the progress engine and completion notification methods of MPI. However, the existing extensibility mechanism in MPI (generalized requests) is not widely utilized and has significant drawbacks. MPI can be…

    Submitted 25 September, 2019; originally announced September 2019.

  16. arXiv:1707.04788  [pdf, ps, other]

    cs.DC

    MPIgnite: An MPI-Like Language and Prototype Implementation for Apache Spark

    Authors: Brandon L. Morris, Anthony Skjellum

    Abstract: Scale-out parallel processing based on MPI is a 25-year-old standard with at least another decade of preceding history of enabling technologies in the High Performance Computing community. Newer frameworks such as MapReduce, Hadoop, and Spark represent industrial scalable computing solutions that have received broad adoption because of their comparative simplicity of use, applicability to relevant…

    Submitted 15 July, 2017; originally announced July 2017.

  17. Provenance Threat Modeling

    Authors: Oluwakemi Hambolu, Lu Yu, Jon Oakley, Richard R. Brooks, Ujan Mukhopadhyay, Anthony Skjellum

    Abstract: Provenance systems are used to capture history metadata; applications include ownership attribution and determining the quality of a particular data set. Provenance systems are also used for debugging, process improvement, understanding data proof of ownership, certification of validity, etc. The provenance of data includes information about the processes and source data that leads to the current…

    Submitted 10 March, 2017; originally announced March 2017.

    Comments: 4 pages, 1 figure, conference

    ACM Class: C.2.0

  18. arXiv:1604.01416  [pdf, other]

    cs.NE cs.DC cs.MS

    dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures

    Authors: Steven Eliuk, Cameron Upright, Anthony Skjellum

    Abstract: A new scalable parallel math library, dMath, is presented in this paper that demonstrates leading scaling when using intranode, or internode, hybrid-parallelism for deep-learning. dMath provides easy-to-use distributed base primitives and a variety of domain-specific algorithms. These include matrix multiplication, convolutions, and others allowing for rapid development of highly scalable applicat…

    Submitted 5 April, 2016; originally announced April 2016.

  19. arXiv:1107.1525  [pdf, ps, other]

    cs.IT cs.GR cs.PF

    Accelerating Lossless Data Compression with GPUs

    Authors: R. L. Cloud, M. L. Curry, H. L. Ward, A. Skjellum, P. Bangalore

    Abstract: Huffman compression is a statistical, lossless, data compression algorithm that compresses data by assigning variable length codes to symbols, with the more frequently appearing symbols given shorter codes than the less. This work is a modification of the Huffman algorithm which permits uncompressed data to be decomposed into independently compressible and decompressible blocks, allowing for con…

    Submitted 21 June, 2011; originally announced July 2011.

    Comments: peer reviewed and published in undergraduate research journal Inquiro in 2009 after Summer work in 2009

    Journal ref: Inquiro, Volume 3, 2009, p. 26 - 29