-
The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics
Authors:
Damien Chapon,
Patrick Hennebelle
Abstract:
The Galactica simulation database is a platform designed to assist computational astrophysicists with their open science approach based on FAIR (Findable, Accessible, Interoperable, Reusable) principles. It offers the means to publish their numerical simulation projects, whatever their field of application or research theme, and provides online access to reduced datasets and object catalogs. The application implements the IVOA Simulation Data Model standard. To provide the scientific community with indirect access to raw simulation data, Galactica can generate, on demand, custom high-level data products to meet specific user requirements. These data products, accessible through online web services, are produced remotely from the raw simulation datasets. To that end, the Galactica central web application communicates with a highly scalable ecosystem of data-processing servers called Terminus by means of an industry-proven asynchronous task management system. Each Terminus node, hosted in a research institute or in a regional or national supercomputing facility, contributes to the ecosystem by providing both the storage and the computational resources required to store the massive simulation datasets and post-process them into the data products requested on Galactica, hence guaranteeing fine-grained sovereignty over data and resources. This distributed architecture is very versatile: it can be interfaced with any kind of data-processing software, written in any language, handling raw data produced by any type of simulation code used in computational astrophysics. Its generality and versatility, together with its excellent scalability, make it a powerful tool for the scientific community to disseminate numerical models in astrophysics in the exascale era.
Submitted 13 November, 2024;
originally announced November 2024.
-
LightAMR format standard and lossless compression algorithms for adaptive mesh refinement grids: RAMSES use case
Authors:
Loïc Strafella,
Damien Chapon
Abstract:
The evolution of parallel I/O libraries and new concepts such as 'in transit' and 'in situ' visualization and analysis have been identified as key technologies to circumvent the I/O bottleneck in pre-exascale applications. Nevertheless, data structures and data formats can also be improved, both to reduce I/O volume and to improve data interoperability between data producers and data consumers. In this paper, we propose a very lightweight and purpose-specific post-processing data model for AMR meshes, called lightAMR. Based on this data model, we introduce a tree pruning algorithm that removes data redundancy from a fully threaded AMR octree. In addition, we present two lossless compression algorithms, one for the AMR grid structure description and one for AMR double/single precision physical quantity scalar fields. We then present performance benchmarks of the lightAMR data model and of the pruning and compression algorithms on RAMSES simulation datasets. We show that our pruning algorithm can reduce the total number of cells in RAMSES AMR datasets by 10-40% without loss of information. Finally, we show that the RAMSES AMR grid structure can be compacted by ~3 orders of magnitude and that the floating-point scalar fields can be compressed by a factor of ~1.2 in double precision and ~1.3-1.5 in single precision, with a compression speed of ~1 GB/s.
Submitted 25 August, 2022;
originally announced August 2022.
-
Boosting I/O and visualization for exascale era using Hercule: test case on RAMSES
Authors:
Loic Strafella,
Damien Chapon
Abstract:
I/O has been clearly identified as one of the bottlenecks in scaling applications to the exascale era. New concepts such as 'in transit' and 'in situ' visualization and analysis have been identified as key technologies to circumvent this particular issue. A new parallel I/O and data management library called Hercule, developed at CEA-DAM, has been integrated into Ramses, an AMR simulation code for self-gravitating fluids. Splitting the original Ramses output format into Hercule database formats dedicated to either checkpoints/restarts (HProt format) or post-processing (HDep format) not only improved the I/O performance and scalability of the Ramses code but also introduced much more flexibility in the simulation outputs to help astrophysicists prepare their DMP (Data Management Plan). Furthermore, the very lightweight and purpose-specific post-processing format (HDep) will significantly improve the overall performance of analysis and visualization tools such as PyMSES 5. The paper introduces the Hercule parallel I/O library and discusses I/O benchmark results.
Submitted 4 June, 2020;
originally announced June 2020.
-
Volume Rendering of AMR Simulations
Authors:
Marc Labadens,
Daniel Pomarède,
Damien Chapon,
Romain Teyssier,
Frédéric Bournaud,
Florent Renaud,
Nicolas Grandjouan
Abstract:
High-resolution simulations often rely on the Adaptive Mesh Refinement (AMR) technique to optimize memory consumption versus attainable precision. While this technique allows for dramatic improvements in computing performance, the analysis and visualization of its data outputs remain challenging. The lack of effective volume renderers for the octree-based AMR used by the RAMSES simulation program has led to the development of the solutions presented in this paper. Two custom algorithms are discussed, based on the splatting and ray-casting techniques. Their usage is illustrated through the visualization of a high-resolution, 6000-processor simulation of a Milky Way-like galaxy. Performance results in terms of memory management and parallel speedup are presented.
Submitted 1 January, 2013; v1 submitted 30 October, 2012;
originally announced October 2012.