-
Ten Simple Rules for Catalyzing Collaborations and Building Bridges between Research Software Engineers and Software Engineering Researchers
Authors:
Nasir U. Eisty,
Jeffrey C. Carver,
Johanna Cohoon,
Ian A. Cosden,
Carole Goble,
Samuel Grayson
Abstract:
In the evolving landscape of scientific and scholarly research, effective collaboration between Research Software Engineers (RSEs) and Software Engineering Researchers (SERs) is pivotal for advancing innovation and ensuring the integrity of computational methodologies. This paper presents ten strategic guidelines aimed at fostering productive partnerships between these two distinct yet complementary communities. The guidelines emphasize the importance of recognizing and respecting the cultural and operational differences between RSEs and SERs, proactively initiating and nurturing collaborations, and engaging within each other's professional environments. They advocate for identifying shared challenges, maintaining openness to emerging problems, ensuring mutual benefits, and serving as advocates for one another. Additionally, the guidelines highlight the necessity of vigilance in monitoring collaboration dynamics, securing institutional support, and defining clear, shared objectives. By adhering to these principles, RSEs and SERs can build synergistic relationships that enhance the quality and impact of research outcomes.
Submitted 3 June, 2025;
originally announced June 2025.
-
Wanted: standards for automatic reproducibility of computational experiments
Authors:
Samuel Grayson,
Reed Milewicz,
Joshua Teves,
Daniel S. Katz,
Darko Marinov
Abstract:
Those seeking to reproduce a computational experiment often need to manually look at the code to see how to build necessary libraries, configure parameters, find data, and invoke the experiment; it is not automatic. Automatic reproducibility is a more stringent goal, but working towards it would benefit the community. This work discusses a machine-readable language for specifying how to execute a computational experiment. We invite interested stakeholders to discuss this language at https://github.com/charmoniumQ/execution-description .
Submitted 21 July, 2023;
originally announced July 2023.
-
Workflows Community Summit 2022: A Roadmap Revolution
Authors:
Rafael Ferreira da Silva,
Rosa M. Badia,
Venkat Bala,
Debbie Bard,
Peer-Timo Bremer,
Ian Buckley,
Silvina Caino-Lores,
Kyle Chard,
Carole Goble,
Shantenu Jha,
Daniel S. Katz,
Daniel Laney,
Manish Parashar,
Frederic Suter,
Nick Tyler,
Thomas Uram,
Ilkay Altintas,
Stefan Andersson,
William Arndt,
Juan Aznar,
Jonathan Bader,
Bartosz Balis,
Chris Blanton,
Kelly Rosa Braghetto,
Aharon Brodutch, et al. (80 additional authors not shown)
Abstract:
Scientific workflows have become integral tools in broad scientific computing use cases. Science discovery is increasingly dependent on workflows to orchestrate large and complex scientific experiments that range from execution of a cloud-based data preprocessing pipeline to multi-facility instrument-to-edge-to-HPC computational workflows. Given the changing landscape of scientific computing and the evolving needs of emerging scientific applications, it is paramount that the development of novel scientific workflows and system functionalities seek to increase the efficiency, resilience, and pervasiveness of existing systems and applications. Specifically, the proliferation of machine learning/artificial intelligence (ML/AI) workflows, need for processing large scale datasets produced by instruments at the edge, intensification of near real-time data processing, support for long-term experiment campaigns, and emergence of quantum computing as an adjunct to HPC, have significantly changed the functional and operational requirements of workflow systems. Workflow systems now need to, for example, support data streams from the edge-to-cloud-to-HPC, enable the management of many small-sized files, allow data reduction while ensuring high accuracy, orchestrate distributed services (workflows, instruments, data movement, provenance, publication, etc.) across computing and user facilities, among others. Further, to accelerate science, it is also necessary that these systems implement specifications/standards and APIs for seamless (horizontal and vertical) integration between systems and applications, as well as enabling the publication of workflows and their associated products according to the FAIR principles. This document reports on discussions and findings from the 2022 international edition of the Workflows Community Summit that took place on November 29 and 30, 2022.
Submitted 31 March, 2023;
originally announced April 2023.
-
A Secure Future for Open-Source Computational Science and Engineering
Authors:
Reed Milewicz,
Jeffrey Carver,
Samuel Grayson,
Travis Atkison
Abstract:
Journalists, public policy analysts, and economists have called attention to the growing importance that high-performance and scientific computing have to national security and industrial leadership. As computing continues to power scientific advances in virtually every discipline, so too does it improve our economic productivity and quality of life. The increasing social, political, and economic importance of research software, however, has also brought the question of software security to the fore. Just as unintentional software errors can threaten the integrity of scientific studies, malicious actors could leverage vulnerabilities to alter results, exfiltrate data, and sabotage computing resources. In this editorial, the authors argue for the need to incorporate security practices and perspectives throughout the research software lifecycle, and they propose directions for future work in this space.
Submitted 11 November, 2022;
originally announced November 2022.
-
On-Device CPU Scheduling for Sense-React Systems
Authors:
Aditi Partap,
Samuel Grayson,
Muhammad Huzaifa,
Sarita Adve,
Brighten Godfrey,
Saurabh Gupta,
Kris Hauser,
Radhika Mittal
Abstract:
Sense-react systems (e.g. robotics and AR/VR) have to take highly responsive real-time actions, driven by complex decisions involving a pipeline of sensing, perception, planning, and reaction tasks. These tasks must be scheduled on resource-constrained devices such that the performance goals and the requirements of the application are met. This is a difficult scheduling problem that requires handling multiple scheduling dimensions, and variations in resource usage and availability. In practice, system designers manually tune parameters for their specific hardware and application, which results in poor generalization and increases the development burden. In this work, we highlight the emerging need for scheduling CPU resources at runtime in sense-react systems. We study three canonical applications (face tracking, robot navigation, and VR) to first understand the key scheduling requirements for such systems. Armed with this understanding, we develop a scheduling framework, Catan, that dynamically schedules compute resources across different components of an app so as to meet the specified application requirements. Through experiments with a prototype implemented on a widely-used robotics framework (ROS) and an open-source AR/VR platform, we show the impact of system scheduling on meeting the performance goals for the three applications, how Catan is able to achieve better application performance than hand-tuned configurations, and how it dynamically adapts to runtime variations.
Submitted 14 August, 2022; v1 submitted 27 July, 2022;
originally announced July 2022.
-
A Case for Fine-grain Coherence Specialization in Heterogeneous Systems
Authors:
Johnathan Alsop,
Weon Taek Na,
Matthew D. Sinclair,
Samuel Grayson,
Sarita V. Adve
Abstract:
Hardware specialization is becoming a key enabler of energy-efficient performance. Future systems will be increasingly heterogeneous, integrating multiple specialized and programmable accelerators, each with different memory demands. Traditionally, communication between accelerators has been inefficient, typically orchestrated through explicit DMA transfers between different address spaces. More recently, industry has proposed unified coherent memory which enables implicit data movement and more data reuse, but often these interfaces limit the coherence flexibility available to heterogeneous systems. This paper demonstrates the benefits of fine-grained coherence specialization for heterogeneous systems. We propose an architecture that enables low-complexity independent specialization of each individual coherence request in heterogeneous workloads by building upon a simple and flexible baseline coherence interface, Spandex. We then describe how to optimize individual memory requests to improve cache reuse and performance-critical memory latency in emerging heterogeneous workloads. Collectively, our techniques enable significant gains, reducing execution time by up to 61% or network traffic by up to 99% while adding minimal complexity to the Spandex protocol.
Submitted 23 April, 2021;
originally announced April 2021.
-
Exploring Extended Reality with ILLIXR: A New Playground for Architecture Research
Authors:
Muhammad Huzaifa,
Rishi Desai,
Samuel Grayson,
Xutao Jiang,
Ying Jing,
Jae Lee,
Fang Lu,
Yihan Pang,
Joseph Ravichandran,
Finn Sinclair,
Boyuan Tian,
Hengzhi Yuan,
Jeffrey Zhang,
Sarita V. Adve
Abstract:
As we enter the era of domain-specific architectures, systems researchers must understand the requirements of emerging application domains. Augmented and virtual reality (AR/VR) or extended reality (XR) is one such important domain. This paper presents ILLIXR, the first open source end-to-end XR system (1) with state-of-the-art components, (2) integrated with a modular and extensible multithreaded runtime, (3) providing an OpenXR compliant interface to XR applications (e.g., game engines), and (4) with the ability to report (and trade off) several quality of experience (QoE) metrics. We analyze performance, power, and QoE metrics for the complete ILLIXR system and for its individual components. Our analysis reveals several properties with implications for architecture and systems research. These include demanding performance, power, and QoE requirements, a large diversity of critical tasks, inter-dependent execution pipelines with challenges in scheduling and resource management, and a large tradeoff space between performance/power and human perception related QoE metrics. ILLIXR and our analysis have the potential to propel new directions in architecture and systems research in general, and impact XR in particular. ILLIXR is open-source and available at https://illixr.github.io
Submitted 2 March, 2021; v1 submitted 25 March, 2020;
originally announced April 2020.
-
Temporal Analysis of Reddit Networks via Role Embeddings
Authors:
Siobhan Grayson,
Derek Greene
Abstract:
Inspired by diachronic word analysis from the field of natural language processing, we propose an approach for uncovering temporal insights regarding user roles from social networks using graph embedding methods. Specifically, we apply the role embedding algorithm, struc2vec, to a collection of social networks exhibiting either "loyal" or "vagrant" characteristics derived from the popular online social news aggregation website Reddit. For each subreddit, we extract nine months of data and create network role embeddings on consecutive time windows. We are then able to compare and contrast how user roles change over time by aligning the resulting temporal embeddings spaces. In particular, we analyse temporal role embeddings from an individual and a community-level perspective for both loyal and vagrant communities present on Reddit.
Submitted 14 August, 2019;
originally announced August 2019.