Showing 1–9 of 9 results for author: Thain, D

  1. arXiv:2506.07838  [pdf, ps, other]

    cs.DC

    A Terminology for Scientific Workflow Systems

    Authors: Frédéric Suter, Tainã Coleman, İlkay Altintaş, Rosa M. Badia, Bartosz Balis, Kyle Chard, Iacopo Colonnelli, Ewa Deelman, Paolo Di Tommaso, Thomas Fahringer, Carole Goble, Shantenu Jha, Daniel S. Katz, Johannes Köster, Ulf Leser, Kshitij Mehta, Hilary Oliver, J. -Luc Peterson, Giovanni Pizzi, Loïc Pottier, Raül Sirvent, Eric Suchyta, Douglas Thain, Sean R. Wilkinson, Justin M. Wozniak, et al. (1 additional author not shown)

    Abstract: The term scientific workflow has evolved over the last two decades to encompass a broad range of compositions of interdependent compute tasks and data movements. It has also become an umbrella term for processing in modern scientific applications. Today, many scientific applications can be considered as workflows made of multiple dependent steps, and hundreds of workflow management systems (WMSs)…

    Submitted 10 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  2. Workflows Community Summit 2022: A Roadmap Revolution

    Authors: Rafael Ferreira da Silva, Rosa M. Badia, Venkat Bala, Debbie Bard, Peer-Timo Bremer, Ian Buckley, Silvina Caino-Lores, Kyle Chard, Carole Goble, Shantenu Jha, Daniel S. Katz, Daniel Laney, Manish Parashar, Frederic Suter, Nick Tyler, Thomas Uram, Ilkay Altintas, Stefan Andersson, William Arndt, Juan Aznar, Jonathan Bader, Bartosz Balis, Chris Blanton, Kelly Rosa Braghetto, Aharon Brodutch, et al. (80 additional authors not shown)

    Abstract: Scientific workflows have become integral tools in broad scientific computing use cases. Science discovery is increasingly dependent on workflows to orchestrate large and complex scientific experiments that range from execution of a cloud-based data preprocessing pipeline to multi-facility instrument-to-edge-to-HPC computational workflows. Given the changing landscape of scientific computing and t…

    Submitted 31 March, 2023; originally announced April 2023.

    Report number: ORNL/TM-2023/2885

  3. arXiv:2203.08811  [pdf, other]

    physics.data-an hep-ex

    Analysis Cyberinfrastructure: Challenges and Opportunities

    Authors: Kevin Lannon, Paul Brenner, Mike Hildreth, Kenyi Hurtado Anampa, Alan Malta Rodrigues, Kelci Mohrman, Doug Thain, Benjamin Tovar

    Abstract: Analysis cyberinfrastructure refers to the combination of software and computer hardware used to support late-stage data analysis in High Energy Physics (HEP). For the purposes of this white paper, late-stage data analysis refers specifically to the step of transforming the most reduced common data format produced by a given experimental collaboration (for example, nanoAOD for the CMS experiment)…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: contribution to Snowmass 2021

  4. A Community Roadmap for Scientific Workflows Research and Development

    Authors: Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Ilkay Altintas, Rosa M Badia, Bartosz Balis, Tainã Coleman, Frederik Coppens, Frank Di Natale, Bjoern Enders, Thomas Fahringer, Rosa Filgueira, Grigori Fursin, Daniel Garijo, Carole Goble, Dorran Howell, Shantenu Jha, Daniel S. Katz, Daniel Laney, Ulf Leser, Maciej Malawski, Kshitij Mehta, Loïc Pottier, Jonathan Ozik, J. Luc Peterson, et al. (4 additional authors not shown)

    Abstract: The landscape of workflow systems for scientific applications is notoriously convoluted with hundreds of seemingly equivalent workflow systems, many isolated research claims, and a steep learning curve. To address some of these challenges and lay the groundwork for transforming workflows research and development, the WorkflowsRI and ExaWorks projects partnered to bring the international workflows…

    Submitted 8 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2103.09181

  5. Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development

    Authors: Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Tainã Coleman, Dan Laney, Dong Ahn, Shantenu Jha, Dorran Howell, Stian Soiland-Reys, Ilkay Altintas, Douglas Thain, Rosa Filgueira, Yadu Babuji, Rosa M. Badia, Bartosz Balis, Silvina Caino-Lores, Scott Callaghan, Frederik Coppens, Michael R. Crusoe, Kaushik De, Frank Di Natale, Tu M. A. Do, Bjoern Enders, Thomas Fahringer, Anne Fouilloux, et al. (33 additional authors not shown)

    Abstract: Scientific workflows are a cornerstone of modern scientific computing, and they have underpinned some of the most significant discoveries of the last decade. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale HPC platforms. Workflows will play a crucial role i…

    Submitted 9 June, 2021; originally announced June 2021.

  6. Workflows Community Summit: Bringing the Scientific Workflows Community Together

    Authors: Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Dan Laney, Dong Ahn, Shantenu Jha, Carole Goble, Lavanya Ramakrishnan, Luc Peterson, Bjoern Enders, Douglas Thain, Ilkay Altintas, Yadu Babuji, Rosa M. Badia, Vivien Bonazzi, Taina Coleman, Michael Crusoe, Ewa Deelman, Frank Di Natale, Paolo Di Tommaso, Thomas Fahringer, Rosa Filgueira, Grigori Fursin, Alex Ganose, Bjorn Gruning, et al. (20 additional authors not shown)

    Abstract: Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) pla…

    Submitted 16 March, 2021; originally announced March 2021.

  7. arXiv:1707.01428  [pdf, other]

    cs.LG cs.DC

    SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

    Authors: Jeff Kinnison, Nathaniel Kremer-Herman, Douglas Thain, Walter Scheirer

    Abstract: Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parall…

    Submitted 22 January, 2018; v1 submitted 5 July, 2017; originally announced July 2017.

    Comments: 10 pages, 6 figures

  8. arXiv:1604.04638  [pdf, other]

    cs.SE cs.DC

    DISTEA: Efficient Dynamic Impact Analysis for Distributed Systems

    Authors: Haipeng Cai, Douglas Thain

    Abstract: Dynamic impact analysis is a fundamental technique for understanding the impact of specific program entities, or changes to them, on the rest of the program for concrete executions. However, existing techniques are either inapplicable or of very limited utility for distributed programs running in multiple concurrent processes. This paper presents DISTEA, a technique and tool for dynamic impact ana…

    Submitted 15 April, 2016; originally announced April 2016.

    Comments: 12 pages, 4 figures, 4 tables

    Journal ref: IEEE/ACM Automated Software Engineering (ASE), 2016

  9. arXiv:1406.7588  [pdf]

    cs.SI cs.CY

    Lessons Learned from an Experiment in Crowdsourcing Complex Citizen Engineering Tasks with Amazon Mechanical Turk

    Authors: Matthew Staffelbach, Peter Sempolinski, David Hachen, Ahsan Kareem, Tracy Kijewski-Correa, Douglas Thain, Daniel Wei, Greg Madey

    Abstract: We investigate the feasibility of obtaining highly trustworthy results using crowdsourcing on complex engineering tasks. Crowdsourcing is increasingly seen as a potentially powerful way of increasing the supply of labor for solving society's problems. While applications in domains such as citizen-science, citizen-journalism or knowledge organization (e.g., Wikipedia) have seen many successful appl…

    Submitted 29 June, 2014; originally announced June 2014.

    Report number: ci-2014/118