-
Distributed Augmentation, Hypersweeps, and Branch Decomposition of Contour Trees for Scientific Exploration
Authors:
Mingzhe Li,
Hamish Carr,
Oliver Rübel,
Bei Wang,
Gunther H. Weber
Abstract:
Contour trees describe the topology of level sets in scalar fields and are widely used in topological data analysis and visualization. A main challenge of utilizing contour trees for large-scale scientific data is their computation at scale using high-performance computing. To address this challenge, recent work has introduced distributed hierarchical contour trees for distributed computation and storage of contour trees. However, effective use of these distributed structures in analysis and visualization requires subsequent computation of geometric properties and branch decomposition to support contour extraction and exploration. In this work, we introduce distributed algorithms for augmentation, hypersweeps, and branch decomposition that enable parallel computation of geometric properties, and support the use of distributed contour trees as query structures for scientific exploration. We evaluate the parallel performance of these algorithms and apply them to identify and extract important contours for scientific visualization.
Submitted 30 September, 2024; v1 submitted 8 August, 2024;
originally announced August 2024.
-
Methods for Linking Data to Online Resources and Ontologies with Applications to Neurophysiology
Authors:
Matthew Avaylon,
Ryan Ly,
Andrew Tritt,
Benjamin Dichter,
Kristofer E. Bouchard,
Christopher J. Mungall,
Oliver Ruebel
Abstract:
Across many domains, large swaths of digital assets are being stored across distributed data repositories, e.g., the DANDI Archive [8]. The distribution and diversity of these repositories impede researchers from formally defining terminology within experiments, integrating information across datasets, and easily querying, reusing, and analyzing data that follow the FAIR principles [15]. As such, it has become increasingly important to have a standardized method to attach contextual metadata to datasets. Neuroscience is an exemplary use case of this issue due to the complex multimodal nature of experiments. Here, we present the HDMF External Resources Data (HERD) standard and related tools, enabling researchers to annotate new and existing datasets by mapping external references to the data without requiring modification of the original dataset. We integrated HERD closely with Neurodata Without Borders (NWB) [2], a widely used data standard for sharing and storing neurophysiology data. By integrating with NWB, our tools provide neuroscientists with the capability to more easily create and manage neurophysiology data in compliance with controlled sets of terms, enhancing rigor and accuracy of data and facilitating data reuse.
Submitted 30 May, 2024;
originally announced June 2024.
-
FAIR for AI: An interdisciplinary and international community building perspective
Authors:
E. A. Huerta,
Ben Blaiszik,
L. Catherine Brinson,
Kristofer E. Bouchard,
Daniel Diaz,
Caterina Doglioni,
Javier M. Duarte,
Murali Emani,
Ian Foster,
Geoffrey Fox,
Philip Harris,
Lukas Heinrich,
Shantenu Jha,
Daniel S. Katz,
Volodymyr Kindratenko,
Christine R. Kirkpatrick,
Kati Lassila-Perini,
Ravi K. Madduri,
Mark S. Neubauer,
Fotis E. Psomopoulos,
Avik Roy,
Oliver Rübel,
Zhizhen Zhao,
Ruike Zhu
Abstract:
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles was proposed in 2016 as a prerequisite for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.
Submitted 1 August, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.